

A PRELIMINARY  CALIBRATION  OF  THE  RCA 


PRICE  S SOFTWARE  COST  ESTIMATION  MODEL 


Thesis 


John  Schneider,  IV 
Captain  USAF 


AFIT/GSM/SM/77S-15


DISTRIBUTION  STATEMENT  A 

Approved for public release;
Distribution  Unlimited 


Presented  to  the  Faculty  of  the  School  of  Engineering 
of  the  Air  Force  Institute  of  Technology 
Air  University 

In  Partial  Fulfillment  of  the 
Requirements  for  the  Degree  of 
Master  of  Science 


John  Schneider,  IV 
Captain  USAF 

Graduate  Systems  Management 

September  1977 


Approved  for  public  release;  distribution  unlimited 


PREFACE 


During the nine years that I have been in the Air Force, I have been
associated with software development projects in a number of capacities:
prospective user, programmer, systems analyst, and manager. This
experience, supported by similar findings in the professional literature,
has convinced me that, given good people, it is the early decisions that
most often determine the success or failure of a software development
project. These decisions are often made in an environment where there is
significant pressure to make the project look attractive. This pressure
can only be withstood by professional software managers who have a sound
body of empirical knowledge on which to base their project estimates.
Today, there are a large number of extremely competent software managers.
The empirical knowledge necessary for them to make good estimates,
however, is still being developed through the continuing efforts of many
competent and dedicated researchers. This research was undertaken in an
effort to contribute to the development of that essential body of
knowledge.

I would like to take this opportunity to thank the many people who
contributed to this effort. Dr. Frank Freiman and Dr. Robert Park of
RCA PRICE Systems deserve special thanks, both for allowing me to use the
PRICE S model and for their many valuable insights into both PRICE S and
cost estimating in general. Many thanks also are due to Ms. Freda Kurtz
of the Air Force Avionics Laboratory. Ms. Kurtz helped formulate the
project and arranged for the computational resources necessary to complete
the research. In addition, I would like to thank the many ASD personnel
who took time from their busy schedules to support this effort. Of course,




TABLE  OF  CONTENTS 


Preface
List of Figures
List of Tables
Abstract

I. INTRODUCTION
    The Cost of Software
    The Need for Software Cost Estimates
    Capital Budgeting Decisions
    Source Selection Decisions
    Project Scheduling
    System Design Trade-Offs
    Performance Evaluation

II. BACKGROUND
    The Software System Life-Cycle
    Conceptual Stage
    Validation Stage
    Full-Scale Development
    Maintenance Stage
    The Software Development Process
    Design Phase
    The Implementation Phase
    The Test and Integration Phase
    Systems Engineering
    Programming
    Documentation
    Configuration Management
    Current Concepts in Software Cost Estimation
    Methods of Estimating Software Development Costs
    The Measurement Problem
    Factors Affecting Software Development Costs
    The Cost of Maintenance
    An Introduction to PRICE S

III. METHODOLOGY
    The Research Problem
    Statement of the Problem
    Purpose
    Objectives
    Scope and Limitations
    Assumptions
    Data Collection
    The Initial Interview
    Cost/Schedule Data Collection
    Statistical Data Collection
    Methods of Analysis
    Subjective Evaluation
    Statistical Analysis
    PRICE S Methodology
    Sensitivity Analysis Techniques
    Systems Descriptions

IV. ANALYSIS AND RESULTS
    Overview of the Systems Studied
    System A
    System E
    Evaluation of the Data Collection Methodology
    The Initial Interview
    Cost/Schedule Data Collection
    Statistical Data Collection
    System A Adjustments
    System E Adjustments
    Sensitivity Analysis

V. SUMMARY AND CONCLUSIONS
    Conclusions
    Conclusions Relating to the Methodology
    Conclusions Relating to PRICE S

Bibliography

Appendix A: A Glossary of PRICE S Terminology
Appendix B: The Initial Interview Format
Appendix C: Derivation of the Sequential Sampling Equation
Appendix D: System Descriptions

Vita

LIST  OF  FIGURES 


Figure

1   PRICE S Resource Expenditure Curves
2   Frequency Distribution for Instructions per Module - System A
3   Frequency Distribution for Instructions per Module - System E
4   Frequency Distribution for Module Weight - System A
5   Frequency Distribution for Module Weight - System E
6   TNINST vs. Cost - System A
7   TNINST vs. Cost - System E
8   Total Instructions vs. Schedule - System A
9   Total Instructions vs. Schedule - System E
10  Development Cost vs. Instruction Density - System A
11  Development Cost vs. Instruction Density - System E
12  Schedule Duration vs. Instruction Density - System A
13  Schedule Duration vs. Instruction Density - System E
14  Development Cost vs. Capacity Utilization - System A
15  Development Cost vs. Capacity Utilization - System E
16  Schedule Duration vs. Capacity Utilization - System A
17  Schedule Duration vs. Capacity Utilization - System E




LIST  OF  TABLES 

Table

I     Impact of Factors on Software Cost Estimation
II    Exponential Relationships Between Development Effort and System Rate
III   PRICE S Application Types
IV    Summary of Sample Size Calculations
V     Comparison of Alternative Methods of Computing "Mix"
VI    Comparison of "Interactive" Modules From Two Systems
VII   Baseline PRICE S Parameters
VIII  Sensitivity Analysis Results




AFIT/GSM/SM/77S-15


ABSTRACT 


Each year, the Department of Defense spends more than three billion
dollars on computer software, yet software managers are notoriously unable
to predict the cost of software development projects. This is especially
true of preliminary cost estimates made during the formative stages of a
project. Even when parametric relationships are used, such estimates
depend heavily on analogy with previously developed systems. The purpose
of this research is to investigate ways of gathering and using descriptive
data for the purpose of making preliminary software cost estimates.

A methodology for the collection of descriptive information on software
systems was developed and used to describe several avionics software
systems. The data thus gathered was then used to "calibrate" the PRICE S
software cost estimation model by relating particular values of several
"subjective" PRICE S input parameters to the observed software system
data. It was found that certain characteristics of software systems
could be objectively measured, and that the PRICE S model is not
incompatible with avionics software systems developed for the Aeronautical
Systems Division of Air Force Systems Command.



A PRELIMINARY  CALIBRATION  OF  THE  RCA  PRICE  S 


SOFTWARE  COST  ESTIMATION  MODEL 


I.  INTRODUCTION 


The past few years have been a period of intensive change in the area
of computer software development. This is true both within the Department
of Defense (DOD) in particular and within the software community as a
whole. The concept of software engineering has gained wide acceptance, and
seems to be well on its way to becoming a mature discipline. Concurrently,
software professionals have developed a deeper understanding of what
software systems are, and how they should be developed. Yet, even as the
tools and techniques required to develop software systems have evolved,
the ability of software development managers to estimate the cost of a
given project has remained notoriously poor. This failure to develop
accurate cost estimating techniques has important consequences both for
software developers and for their potential customers.

The  Cost  of  Software 

In 1976, the Deputy Assistant Secretary of Defense (Material Acquisition)
estimated that annual DOD expenditures for computer software exceeded
three billion dollars (Gansler, 1976:1). On the civilian side, Boehm
estimated that expenditures for software in the United States in 1976
would exceed 15 billion dollars (Boehm, 1975:4). While there may be some
overlap and inaccuracy in these two estimates, they unquestionably document
the overall size of the software industry in the United States. Even so,
such estimates only reflect the resources consumed in the development and
maintenance of software. They do not purport to measure the opportunity
costs of resources wasted because of schedule slippage, uneconomical
assignment of resources, and general inability to correctly gauge the
consequences of possible future courses of action.

The  Need  for  Software  Cost  Estimates 

There are a number of reasons why accurate, timely estimates of total
cost are essential to the management of all large projects, including
those that involve computer software development. Some of the key
decisions which depend on cost estimates include: capital budgeting
decisions, source selection decisions, project scheduling, system design
trade-offs, and performance evaluation. The impact of cost estimates on
each of these decisions is discussed briefly below.

Capital Budgeting Decisions. The decision to undertake a specific
project is usually made on the basis of some kind of comparison between
expected costs and benefits. Even if the considerations involved are
highly subjective in nature, some estimate of the project duration and
the manpower (or other critical resources) required must normally be made
before the project is approved. Thus some form of cost estimate is
required prior to project approval.

Source Selection Decisions. Once a project has been approved, it is
necessary to select the organization that will actually develop the
software. This may be an "in-house" software development team, or it may
be an independent contractor. In either case, it is necessary to be able
to accurately estimate the project costs for differing development options.
This is especially true since unrealistically low bids are not uncommon
in the software industry. Such bids may result from technical naivete on
the part of the bidder, or from an attempt to "buy in" on the contract.
In either case, it is necessary that unrealistically low bids be identified
as such. Otherwise a realistic source selection decision cannot be made.

Project Scheduling. The development of detailed project schedules
requires the ability to estimate resource requirements. In essence, each
work package in the schedule becomes a project, the requirements for which
must be estimated independently. Thus, the quality of the project schedule
depends directly on the project manager's ability to accurately estimate
the requirements of each component of the project.

System Design Trade-Offs. Very few system acquisitions involve a
pure software development. Most often, the software system is "embedded"
in a larger system which is being developed as a whole. In such cases,
it is not unusual to find that certain system functions might be performed
equally well by either hardware or software. The decision to use hardware
or software then often hinges on the estimated costs of each approach. In
some cases, such decisions can radically alter the architecture of the
entire system.

Performance Evaluation. In many cases, the closest available
approximation to an objective standard of project performance is the
project schedule, and the associated resource requirement (cost) estimates.
"Good management" is normally associated with projects that achieve their
objectives ahead of schedule and "below cost." Yet, as stated above, the
schedule itself is directly dependent on the ability of the responsible
managers to accurately estimate resource requirements. Thus, realistic
performance evaluation is often directly linked to the quality of resource
estimates.



From the above, it is apparent that the ability to obtain realistic
estimates of the resources required to complete a given project is
necessary for any software development manager to be effective.
Unfortunately, the state-of-the-art of software cost estimating has not yet
advanced far enough to meet these needs. Therefore, there has been a
continuing search for new and better methods of estimating software costs.
The research documented here is a small part of that search.




II.  BACKGROUND 



The purpose of this chapter is to introduce the reader to the RCA
PRICE S software cost estimating model. Before this can be done, however,
it is necessary to place the model in perspective. This requires a brief
discussion of three subjects: the software system life-cycle, the software
development process, and the current literature on the subject of software
cost estimation.


The  Software  System  Life-Cycle 

The term "system life-cycle" is used to describe the stages through
which any man-made system passes. The system is first conceived as an
idea. Then the idea is transformed into physical reality and used.
Finally, the system invariably outlives its usefulness and is discarded.
Air Force Regulation (AFR) 800-14, "Computer Resources Acquisition and
Support," refers to five discrete stages in the life-cycle of Air Force
systems. These are termed the conceptual, validation, full-scale
development, production, and deployment stages. Any system may be defined
in terms of these five stages, including software systems (Department of
the Air Force, 1974:2-1 through 2-6).

One point which must be made clear is that a given software system
can pass through the entire life-cycle while the "supra-system" of which it
is a part remains in one single stage. For example, the software that
must be developed and used to test a prototype aircraft may be completely
replaced with new "operational" software during the full-scale development
phase of the aircraft's life-cycle. The following discussion refers to the
stages of the life-cycle of a software system. No reference is intended
to the stage of the supra-system.




software can be neither "used up" nor "broken." What actually happens to
a software system during the last stage of its life-cycle is that it is
modified on a more or less continuous basis to reflect: corrections for
errors, enhancements of capability, changes in the operating environment,
and changes in the users' perception of what the system should accomplish.
Almost invariably, these modifications are performed, or at least
controlled, by a single organization in order to keep all copies of the
software system identical. Within the software community, this process
of controlled modification of software systems is universally referred to
as maintenance. Thus, within the context of software systems, it is more
meaningful to discuss a maintenance stage than a deployment stage.

Given the above, it is reasonable to discuss the life-cycle of a
software system in terms of four stages: conceptual, validation,
full-scale development, and maintenance.

Conceptual Stage. Every system evolves from an idea. The
identification of "good" ideas and the expansion of these ideas into a
formal, approved statement of user requirements characterizes this phase.
AFR 800-14 states that "the major definitive document resulting from this
phase is the initial system specification..." (Department of the Air
Force, 1974:2-2). It is during this phase that the major decisions
concerning what the system will do, where it will fit into the general
scheme of things, and how it will work are made and documented.

Validation Stage. During this stage, the system requirements are
developed in detail, and the ramifications of acquiring the system are
evaluated. In the Air Force, the "authenticated system specification"
is developed, and the feasibility, cost, risk, and schedule of system
development are examined in detail (Department of the Air Force, 1974:2-2).
When this phase is complete, the developing agency is authorized to
proceed with full-scale development.

Full-Scale Development. This is the stage which encompasses the
actual creation of the complete software system. For most systems of
significant size, this phase involves much more than the production of
computer code. In order for the final product to be useful, most or all
of the following must be completed in an acceptable manner: computer
programs, documentation, user manuals, user training, programmer manuals,
programmer training, test plans and procedures, and test results. In
addition, for most operational software systems, a number of
nonoperational (support) computer programs must be developed to facilitate
the testing and maintenance of the operational software system. The
full-scale development stage is complete when the using agency accepts the
system for routine operational use (Department of the Air Force, 1974:2-2
through 2-3).

Maintenance Stage. The maintenance stage begins during the full-scale
development stage and extends throughout the remainder of the system
life-cycle. Maintenance begins when a computer program has been
completely coded. From this time forward, it will undergo a series of
modifications that will end only when the program is discarded. At
first these modifications are controlled by the individual programmer
(debugging), later by the developing agency (system test and integration),
and finally by the operational user (routine operations). In any case
the intent of maintenance is the same: to modify the system so as to
make it operate more closely in accordance with current system objectives.
In essence, the maintenance stage constitutes a continuing process of
refinement which permits the system to grow and adapt to a changing
operational environment.


The  Software  Development  Process 


During both the full-scale development and the maintenance stages
of the software system life-cycle, new software is developed. Depending
on the situation, it may be most convenient to view "the software being
developed" as encompassing: the entire system, a single program, or
possibly a single subroutine. In any case, the idea that the process by
which software is developed is independent of the particular problem has
strong intuitive appeal.

The model of the software development process which is described
below is the one on which the RCA PRICE S model is based. It must be
noted, however, that RCA PRICE S Systems, Inc. intentionally leaves a
wide degree of flexibility in the definitions of model terminology. This
approach is intended to allow individual model users a degree of latitude
in adapting the model to their own accounting system. As a result, the
specific definitions and interpretations given below are the
responsibility of this author, while the overall concept is the creation of
Dr. Frank Freiman and his associates at PRICE Systems.

According to Freiman, the software development process can be
described in terms of three overlapping phases, during which five
activities, or tasks, are performed. The three phases are termed: "design,"
"implementation," and "test and integration." The five activities are
"systems engineering," "programming," "configuration management,"
"documentation," and "program management." All five of these activities
are performed during all of the phases, although the amount of each
activity varies from phase to phase (Freiman, 1977).

The general pattern of resource expenditure for each of the phases
is shown in Figure 1. The most significant characteristics of the phases
are that they have specific starting and ending points, that they overlap,
and that they interact. The interaction between phases is such that if
one phase begins or ends at the "wrong" time, the resources that need to
be expended in other phases are increased (Freiman, 1977).

Design Phase. The design phase begins when the agency which is to
actually develop the software system is authorized to proceed with
full-scale development. During this phase, system architecture is
determined, requirements are allocated to system components (programs), and
the individual programs are designed in detail. Each program moves from the
design to the implementation phase separately, as it is "released to
code." Proposed changes to the current system baseline are also
considered to be in the design phase until they are approved. When changes
are approved, they either take on the status of the individual program(s)
involved, or they force the program(s) back to the design phase. Of
course, changes can also create and delete entire programs or groups of
programs. Newly created programs enter the design phase.

The Implementation Phase. The implementation phase begins when the
first program is "released to code" and continues until the last program
is formally accepted for testing. During this phase, each programmer
codes and debugs his program, documents the "as built" design, and turns
over responsibility for the program to the test and integration team.
The implementation phase is generally the smallest of the three phases in
terms of total resource expenditure.




The Test and Integration Phase. This phase begins when the core of
the test and integration team is formed. Normally, this is at the point
where system design has progressed far enough for test planning to begin,
but well before any formal testing is required. During this phase, two
different but inseparable functions are performed. First, the individual
programs are fitted into place as part of the overall system. Secondly,
the system and each component of the system are tested to ensure that
each required function is performed, and that the system functions
properly as a whole. When a programmer certifies that his program is
ready for formal testing, that program enters the test and integration
phase. This phase ends when the system is turned over to the user for
routine operations.

Throughout the software development process, five activities are
performed, all of which are essential to the success of the project. As
with the phases, the dividing line between the activities is rather
arbitrary, and different situations may warrant different boundary
definitions. Still, the main thrust of each activity is quite clear and
they may be easily discussed in turn.

Systems Engineering. This activity encompasses the technical tasks
which are concerned with the system as a whole. During the design phase,
this includes developing precise system specifications, dividing the
system into component programs and defining the associated interfaces, and
allocating requirements to the separate components. During the
implementation phase, it includes the review of specific components to
ensure that they meet system objectives and a continuing evaluation of the
system through analysis and simulation to minimize the risk of not meeting
system requirements. Finally, during the test and integration phase the
systems engineering task encompasses system oriented testing and problem
resolution.

Programming. Programming is the central activity on any software
development project, in that it is the activity that directly creates
working computer programs. It encompasses the work necessary to design,
code, and test individual programs. In the areas of designing and testing,
where the distinction between systems engineering and programming is
necessarily rather obscure, the distinguishing criterion is whether a
particular task is oriented toward an individual program or toward the
system as a whole. During the design phase, the majority of the
programming work is individual program design. During the implementation
phase, the emphasis shifts to individual programmers coding and debugging
their programs. Finally, during the test and integration phase, programming
activities include formal testing and maintenance programming.

Documentation. The primary tangible product of a software development
project is documentation. Indeed, it sometimes seems as if the
generation of printed material is the sole purpose of the project. If
all that was required for the documentation activity was drafting, it
could probably be included in the programming and systems management
activities. Unfortunately, this is not the case. Documentation must not
only be drafted; it must also be edited, printed, distributed, reviewed,
corrected, and updated. For projects of any size, this is a task of
significant magnitude.

Configuration Management. This activity involves the determination,
at all times, of precisely what is, and is not, an approved part of the
system. To accomplish this, it is necessary to perform three tasks. The
first task involves incorporating specifications into the "system
baseline." This may be viewed as formally declaring a particular document
as being "officially correct." Once a document has been incorporated into
the baseline, changes may only be made through the configuration control
task. This task involves the evaluation of alleged deficiencies in and
proposed changes to the system baseline. Finally, it is necessary to
provide for the dissemination and control of approved baseline material.
The conflicting needs for both easy access to baseline information and
positive control of all changes to the baseline often severely complicate
this task.

Program Management. As with any other organized human effort,
software development projects must be managed. This activity includes the
supervisory, financial, legal, and general administrative tasks necessary
to plan, organize, direct and control the project. Ideally, program
management is the integrating activity that coordinates the other
activities into a single, coherent effort.


Current Concepts in Software Cost Estimation

According to Clapp, "cost estimation can be defined as predicting
the cost of resources needed to carry out a process which delivers a
specified set of products" (Clapp, 1976:5). In the case of computer
software systems, the resources consist largely of manpower and computer
time costs. Of course, the relevant process is the software development
process described above. Finally, the product generally consists of
"operational" (i.e., tested and formally accepted) computer software and
software documentation. It is customary to estimate separately such
associated costs as supplies and travel as well as additional tasks such as
user training (Clapp, 1976:6).

In most cases, a cost estimate should include all of the relevant
costs that would be incurred under the assumed conditions throughout the
entire life-cycle of the system. For all practical purposes this includes
the costs of full-scale development and maintenance. Most of the work in
the field of software cost estimation has focused on full-scale
development, while maintenance has been undeservedly neglected. Recently,
however, some preliminary work has been done on maintenance cost
estimation. This work will be discussed briefly following a more complete
survey of the state-of-the-art in the full-scale development area.

Methods  of  Estimating  Software  Development  Costs 

Although there is little agreement in the literature on the specific
cost model which should be used to estimate software development costs,
there does seem to be general agreement on the component methods which may
be used. In general, most software cost estimating models combine at
least some of the following methods: decomposition, historical analogy,
parametric equations, and unit of work. Other methods, such as the use
of resource constraints, are not so much cost estimating methods as upper
bounds for "design to cost" type situations.

Decomposition is simply a method of breaking the problem into pieces
so that each may be estimated separately. It is often used to estimate
the total number of instructions or modules in a proposed system.
Alternatively, decomposition may be used to break the system into
individual "work packages," which may be estimated separately. In any
case, decomposition almost always requires assumptions about the
structure of the proposed system. (Clapp, 1976:15)




If there is a universally recognized method of estimating software
cost, it is historical analogy. In this method, the estimate is based on
the historical cost experience associated with one or more similar
projects. If no similar experience exists, the proposed system is
decomposed to the point where analogies may be drawn for each component.
The advantages of this method are simplicity and potential accuracy when
a good analogy can be found. On the other hand, there is a serious danger
of making invalid analogies, and the cost of maintaining and searching the
required historical records must be considered. (Nelson, 1966; Aron,
1969; Smith, 1975; Clapp, 1976)

Most of the recent work on software cost estimation has revolved
around the search for reliable parametric equations. Such equations
relate cost to other parameters which describe the system, and which are
"more easily" estimated. The most common type of parametric equation
used for software development is based on the number of instructions in
the proposed system. This method is extremely fast and simple to apply.
It is also only as accurate as the available equations and the estimates
of the input parameters. (Clapp, 1976:16-17)

The final basic method of estimating software cost is the unit of
work method. In this case, the task of building the system is decomposed
into discrete "work packages." Each work package must be small enough to
be accurately estimated. Aron mentions a limit of eight weeks' work for
one programmer. (Aron, 1969:8) The system development cost is then the
sum of the costs for each work package. Both Aron and Smith maintain that
this method works well for small systems, but breaks down for large
software development projects with their complex interactions between both
programs and people. (Aron, 1969:8-9; Smith, 1975:2-18)
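
The unit of work method described above can be sketched in a few lines.
This is only an illustration: the package names and man-week figures are
hypothetical, and the eight-week ceiling is the limit attributed to Aron
in the text, applied here as a simple validity check.

```python
# Hypothetical sketch of the unit of work method: the estimate is the sum
# of the work-package estimates, and any package larger than the limit is
# sent back for further decomposition.

MAX_PACKAGE_WEEKS = 8  # limit mentioned by Aron for one programmer


def total_effort(packages, limit=MAX_PACKAGE_WEEKS):
    """Return total man-weeks for a list of (name, man_weeks) packages."""
    oversized = [name for name, weeks in packages if weeks > limit]
    if oversized:
        raise ValueError(f"decompose further before estimating: {oversized}")
    return sum(weeks for _, weeks in packages)


# Illustrative packages only; none of these figures come from the sources.
packages = [("input parser", 3), ("report writer", 5), ("file handler", 8)]
print(total_effort(packages))
```

A sum of this kind does not capture the interactions between programs and
people that, according to Aron and Smith, cause the method to break down
on large projects.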




The Measurement Problem. Before proceeding to a discussion of the
factors which affect software development costs, it is necessary to
discuss the way in which software development cost is measured.
Unfortunately, there is, at present, no consensus as to what should be
included as a relevant cost of software. This problem is especially acute
when the software development project is an integral part of a larger
effort. In such cases, systems oriented functions may be performed, in
large part, by central facilities whose costs are not directly allocable
to software development. Such situations, as well as normal differences
in accounting procedures between organizations, introduce a wide
variability into recorded software costs that is independent of actual
resource consumption.

This problem seems to be more serious for government cost analysts
than for their counterparts in industry. This is because each company
may, if it chooses, standardize its own cost accounting procedures and
cost estimating techniques. The government, on the other hand, must
accept software cost data from a wide range of individual accounting
systems if it is to establish a comprehensive cost data base. Likewise,
government cost analysts must be able to work with proposals based on
different accounting assumptions. Because of this problem, a number of
proposals have been made which would set uniform standards for software
costs reported to the government. The most recent of these is the
excellent report by Graver which contains a detailed treatment of the
problems and issues associated with attempts to standardize software cost
reporting. (Graver, 1977)

Factors Affecting Software Development Costs. Measurement problems
aside, one of the major reasons why software development costs are so
difficult to estimate is the large number of factors which have a
significant effect on them. Researchers in this area have consistently
found that at least 40 different factors can significantly affect the
cost of software development. (Weinwurm, 1965; Herd, 1977) To illustrate
the extent of this problem, Table I lists the factors which were found to
be significant in a recent study by Herd.

Despite the large number of significant factors, recent work has
concentrated on only a few of them. Of these, the single most
significant factor is the size of the software system. Unfortunately, the
precise relationship between size and cost is still being debated in the
literature.

The first problem associated with attempts to relate system size to
development cost is measuring size. The most common solution is to use
the number of delivered object instructions as the basic measure of size.
This is usually justified as an attempt to minimize the effects of
different programming languages on size. (Graver, 1977:6-8) To date,
Herd's group has published the most extensive and conceptually valid
investigation of the relationship between system size and development
cost in the public domain. The results of his regressions show nearly
identical values for R^2 and standard error when two different measures
of size were analyzed in conjunction with the same cost data.
(Herd, 1977) At present, consistency in using a particular definition of
size seems to be more important than the particular definition chosen.

[Table I: table body not recovered. Legend entries: High significant
impact; Medium significant impact; Low significant impact; Negligible
significant impact.]

Once a measure of size has been chosen, a functional form must be
selected to relate it to cost. Historically, two functional forms have
been used, the linear and the exponential. The linear form is:

    $C = aS$

where:  C = a measure of cost (usually either dollars or man-months)

        S = a measure of size (usually either number of object or
            source instructions)

        a = a constant (usually estimated using linear regression)

while the exponential form may be written:

    $C = aS^{b}$

where:  C, S, and a are as above,

        b = a constant exponent (also usually the result of some
            form of regression)

Of course, the linear form is a special case of the exponential form.
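
The relationship between the two functional forms can be illustrated with
a short sketch. The function names and the numerical values of a, b, and
S below are hypothetical and chosen only for illustration; they are not
taken from any of the studies cited.

```python
# Sketch of the two functional forms relating size S to cost C.
# All numerical values are illustrative assumptions.

def linear_cost(size, a):
    """Linear form: C = a * S."""
    return a * size


def exponential_cost(size, a, b):
    """Exponential form: C = a * S**b."""
    return a * size ** b


a = 2.0
for size in (10, 20, 40):
    print(size,
          linear_cost(size, a),               # b fixed at 1
          exponential_cost(size, a, 1.0),     # same as the linear form
          exponential_cost(size, a, 1.2))     # grows faster than linearly
```

With b = 1 the exponential form reproduces the linear form exactly, which
is the sense in which the linear form is a special case.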

The prime arguments for the linear form seem to be simplicity and
compatibility with standardized, multivariate linear regression programs.
These arguments are not to be discounted, since they do allow other
factors to be considered along with size. As long as there is a strong
conceptual argument for the exponential form, however, any results that
depend on the linear form must remain open to question.

The argument in favor of the exponential form can be traced back
at least as far as Farr and Nanus' work in 1964. (Farr, 1964:13) The
most prominent proponent of this viewpoint, however, has been Brooks,
who makes a strong, conceptual argument in favor of an exponential
relationship. In essence, Brooks argues that as the size of the system
development team grows, the overhead associated with intra-team
coordination increases exponentially. For large systems, this
communication problem becomes, according to Brooks, the dominant
influence on development cost. (Brooks, 1975)

Until very recently, the available data concerning the proper value
of the exponent were inconclusive. Recently, however, Herd has
published an empirical study of software cost estimation which sheds
considerable light on this question. This study has two distinct
advantages over previous work. First, the data base used (129 cases
after deletions) is the largest that this author is familiar with.
(Herd, 1977:122) In addition, this is the only study that this author
has seen which used nonlinear regression techniques.

Most studies of the exponential relationship have used "log-linear"
regression techniques. According to Draper and Smith, this approach
is only valid if the logarithm of the error term is normally
distributed. (Draper, 1966:132) In the case of software cost data, this
does not appear to be a valid assumption. In addition, the magnitude
of the errors is large enough to significantly distort the "fit."
(Herd, 1977:123)
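
The difference between the two fitting approaches can be sketched as
follows. This is a minimal illustration under stated assumptions, not a
reconstruction of Herd's procedure: the data are made up, the function
names are hypothetical, and the nonlinear fit uses a simple
golden-section search over b (with the best a for each b computed in
closed form) rather than whatever regression program Herd used.

```python
import math

# Two ways to fit C = a * S**b to (size, cost) data. Illustrative only.

def fit_log_linear(sizes, costs):
    """Ordinary least squares on ln(C) = ln(a) + b * ln(S)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(c) for c in costs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return math.exp(my - b * mx), b


def _best_a(sizes, costs, b):
    """For a fixed b, the a that minimizes sum((C - a*S**b)**2)."""
    powers = [s ** b for s in sizes]
    return (sum(c * p for c, p in zip(costs, powers))
            / sum(p * p for p in powers))


def _sse(sizes, costs, b):
    """Sum of squared errors in the original (untransformed) cost units."""
    a = _best_a(sizes, costs, b)
    return sum((c - a * s ** b) ** 2 for s, c in zip(sizes, costs))


def fit_nonlinear(sizes, costs, lo=0.1, hi=3.0, iterations=100):
    """Golden-section search over b; assumes one minimum on [lo, hi]."""
    g = (math.sqrt(5) - 1) / 2
    for _ in range(iterations):
        c1 = hi - g * (hi - lo)
        c2 = lo + g * (hi - lo)
        if _sse(sizes, costs, c1) < _sse(sizes, costs, c2):
            hi = c2
        else:
            lo = c1
    b = (lo + hi) / 2
    return _best_a(sizes, costs, b), b


# Made-up data with scatter; the two methods weight the points differently.
sizes = [2, 5, 10, 20, 40]
costs = [9.0, 30.0, 62.0, 170.0, 330.0]
print("log-linear:", fit_log_linear(sizes, costs))
print("nonlinear: ", fit_nonlinear(sizes, costs))
```

The log-linear fit minimizes squared errors in the logarithm of cost,
while the nonlinear fit minimizes squared errors in cost itself, so the
two generally give different values of a and b when the data are
scattered.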

Herd divided his sample into four types of systems: command and
control, scientific, business, and utility. He then used nonlinear
regression techniques to solve for the constant parameters. Table II
summarizes the results of this analysis. As the table indicates, the
exponents calculated vary considerably for different types of systems.

Herd's results are interesting, but still not conclusive. For one
thing, the size of the data sample is still quite small considering
the dispersion of the data. Another problem is that the largest system
included in the sample apparently represented less than 60,000 source
lines, while most of the data represented much smaller systems. Thus,
Herd's results may not adequately reflect "large" software development
projects.


TABLE II

Exponential Relationships Between Development Effort and System Size

Regression Model:  $MM = aI^{b}$

Where:  MM   = Total man-months for system development
        I    = Total instructions in system
        a, b = Constants calculated using nonlinear regression

System                 Type of                                Standard
Type                   Instructions     a       b      R^2    Error

Command and Control    OBJECT           4.57    1.29   .78    41.1
                       SOURCE           4.09    1.26   .80    41.1

Scientific             OBJECT           4.50    1.07   .74    72.1
                       SOURCE           7.05    1.02   .78    72.1

Business               OBJECT           2.90     .78   .48    12.4
                       SOURCE           4.50     .78   .61    10.7

Utility                OBJECT          12.04     .72   .30    58.1
                       SOURCE          10.08     .81   .45    51.7

All                    OBJECT           4.79     .99   *      62.2
                       SOURCE           5.26    1.05   *      50.7

*Not reported


Even so, Herd's analysis does seem to be the most extensive and
conceptually correct available. Despite its limitations, it presents
strong evidence that diseconomies of scale (large systems cost more per
instruction) may exist for at least some classes of software system
developments. Such evidence should not be ignored, especially
considering the large number of small systems represented in the sample.
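
The meaning of the exponent b can be made concrete with a small
calculation. Under the model MM = aI^b, doubling system size multiplies
total effort by 2^b and effort per instruction by 2^(b-1), regardless of
the value of a or the units of I. The sketch below applies this to the
source-instruction exponents reported in Table II; the function names
are hypothetical.

```python
# Effect of doubling system size under MM = a * I**b.
# Exponents are the SOURCE-instruction values of b from Table II.

SOURCE_EXPONENTS = {
    "command and control": 1.26,
    "scientific": 1.02,
    "business": 0.78,
    "utility": 0.81,
}


def effort_growth(b, size_ratio=2.0):
    """Factor by which total man-months grow when size grows by size_ratio."""
    return size_ratio ** b


def per_instruction_growth(b, size_ratio=2.0):
    """Factor by which man-months per instruction change."""
    return size_ratio ** (b - 1.0)


for system_type, b in SOURCE_EXPONENTS.items():
    print(f"{system_type:20s} total x{effort_growth(b):.2f}  "
          f"per instruction x{per_instruction_growth(b):.2f}")
```

A per-instruction factor above 1 corresponds to a diseconomy of scale
(b greater than 1), and a factor below 1 to an economy of scale (b less
than 1).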

Unfortunately, size alone is insufficient to explain even half of
the variance in software development cost. (Herd, 1977) Some of the
other factors which have received significant attention in the
literature include application mix, difficulty, schedule constraints, and
source language. Each of these is discussed briefly below.

There is evidence that some types of applications are easier to
code than others. Herd's classification of systems into four types
(command and control, scientific, business, and utility) represents one
method of modeling this effect. His results seem to support the
hypothesis that, however he discriminated between types, there were
significant differences between them. (Herd, 1977) This, however, is
not the only method of modeling the effects of differences between
applications.

Wolverton achieves essentially the same effect by decomposing
each system into six categories of routines. His categorization
includes: control, pre or post algorithm processors, logical or
mathematical algorithms, data management, and time critical processors.
(Wolverton, 1972:9) This approach is based on the assumption that
different productivity rates are inherently associated with different
types of routines. Under this assumption, different "kinds" of systems
would be typified by different "mixes" of these "application types."
Thus, Herd's findings are not incompatible with this approach.
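
The "application mix" idea can be sketched as a weighted sum. The
category names below follow the list given above, but the productivity
rates are purely hypothetical placeholders (they are not Wolverton's
figures), as are the function and variable names.

```python
# Sketch of an application-mix estimate: each category of routine has its
# own productivity rate, and a system is described by the fraction of its
# instructions that falls in each category.

HYPOTHETICAL_RATES = {  # instructions per man-month; illustrative only
    "control": 100.0,
    "pre/post algorithm processor": 200.0,
    "algorithm": 150.0,
    "data management": 250.0,
    "time critical processor": 50.0,
}


def effort_for_mix(total_instructions, mix, rates=HYPOTHETICAL_RATES):
    """Man-months for a system whose instruction mix sums to 1.0."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("mix fractions must sum to 1.0")
    return sum(total_instructions * fraction / rates[category]
               for category, fraction in mix.items())


# Two hypothetical systems of equal size but different mixes.
data_heavy = {"data management": 0.7, "algorithm": 0.3}
real_time = {"time critical processor": 0.6, "control": 0.4}
print(effort_for_mix(10_000, data_heavy))
print(effort_for_mix(10_000, real_time))
```

Under these assumed rates, two systems of the same size receive
different estimates solely because of their mixes, which is the effect
that a system-type classification such as Herd's captures more coarsely.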

Wolverton's approach has several distinct advantages over the
method used by Herd. First, it should be possible to define rather
simple decision rules for categorizing individual routines. Given these
rules, any system can be described. On the other hand, it is not hard
to imagine the other approach leading to continuing classification
problems. The other major advantage of Wolverton's approach is that it
is essentially numerical in nature, and thus quite compatible with
quantitative analysis. The other approach is more qualitative in
nature, and thus harder to analyze.

Even allowing for different application mixes, the fact remains
that some programming tasks are more difficult than others. Even though
measuring difficulty is inherently a subjective process, differences
in difficulty must be accounted for. Aron has proposed a method of
measuring difficulty which uses three discrete levels: easy, medium,
and hard. According to this categorization, "easy" programs have very
few interactions with other system elements. "Medium" programs, on the
other hand, not only have more interactions with other system elements,
they are also characterized by the ability to solve general classes of
problems (e.g., language compilers). Finally, "hard" programs interact
with many system elements. They are typified by system monitors and
operating systems. (Aron, 1969:12-13)


Wolverton has proposed that an additional factor be taken into
account when attempting to measure difficulty. In his opinion, "new"
(or unfamiliar) applications are harder than "old" (or familiar)
applications. Thus, Wolverton advocates the use of six categories,
combining Aron's three classifications with the prefix "old" or "new."
(Wolverton, 1972:13-14)

Another factor which has a significant effect on software
development cost is schedule. Unquestionably, the most eloquent treatment
of the relationship between cost and schedule is Brooks' classic essay
"The Mythical Man-Month." His discussion is based on the premise that
manpower and schedule are directly interchangeable only insofar as the
tasks involved are independent of each other. If the tasks involved
are inherently sequential, however, adding manpower cannot reduce the
time required to complete them. In fact, Brooks argues convincingly
that the assignment of too many people to a development organization is
counterproductive, because of the increasing level of coordination
required of each individual. The result of this argument is Brooks'
Law: "Adding manpower to a late software project makes it later."
(Brooks, 1975)
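
The coordination burden behind this argument can be illustrated with a
short calculation. If every member of a team must coordinate separately
with every other member, the number of pairwise communication paths is
n(n-1)/2, the expression Brooks uses for this case. The function names
below are hypothetical, and the sketch makes no claim about how much
effort each path actually consumes.

```python
# Pairwise coordination paths in a team of n people, assuming everyone
# must coordinate with everyone else (the worst case Brooks describes).

def communication_paths(team_size):
    """Number of distinct pairs in a team: n * (n - 1) / 2."""
    if team_size < 0:
        raise ValueError("team size cannot be negative")
    return team_size * (team_size - 1) // 2


def paths_per_person(team_size):
    """Coordination links each individual must maintain."""
    return max(team_size - 1, 0)


for n in (2, 5, 10, 20, 40):
    print(n, communication_paths(n), paths_per_person(n))
```

Under this assumption, doubling the team roughly quadruples the total
number of paths, so coordination overhead grows much faster than the
manpower added.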

Unfortunately, little empirical work has been published on the
relationship between cost and schedule. Putnam has developed a software
cost model which incorporates some of the relationships required by
Brooks' theory. This model is, however, derived from a theoretical,
rather than an empirical base. (Putnam, 1969) Herd's group addressed the
relationship between system size and development time, but could find
no quantitative data with which to measure the cost impact of deviation
from an optimum schedule. (Herd, 1977:43-45) Graver's group, on the
other hand, attempted a truly unique analysis of the interrelationships
between phases of development. Unfortunately, lack of data severely
limited the usefulness of their results. Still, their approach should
be of value to future investigators. (Graver, 1977)

The final factor affecting software development cost which will be
discussed here is programming language. Developments which use higher
order languages (HOL) such as FORTRAN are generally felt to be less
expensive than similar developments which use machine oriented languages
(MOL). First, each source statement in a HOL program is expanded into
several machine instructions, while MOL statements generally correspond
directly to individual machine instructions. In addition, it is often
argued that systems written in a HOL are easier to read and understand.
(Graver, 1977:6-10)

Graver's group has recently investigated the effects of language on
development cost. Using log-linear regression, they found that
developments using MOL's tended to require about twice as many total
man-months as developments that used HOL's. The statistical confidence
level calculated for this relationship (0.8) was not high enough,
however, to permit a definitive conclusion to be drawn. Graver does
report, however, that these results agree with the results of some
controlled experiments with computer languages. (Graver, 1977:6-9, 16)

The above discussion has briefly touched on some of the more
significant factors which affect the development cost of software systems.
Although full-scale development is usually the most costly phase of the
software system life-cycle, the cost of maintenance is often significant.
Brooks, for example, states that, "The total cost of maintaining a
widely used program is typically 40 percent or more of the cost of
developing it." (Brooks, 1975:121)

The Cost of Maintenance. Graver points out that software
maintenance has two components. These will be referred to as "error
correction" and "system enhancement." It is Graver's position that only
the error correction component of maintenance costs should be included in
life-cycle cost estimates. This position is based on the argument that
enhancement type modifications are concerned with product improvement
rather than success or failure in meeting original product requirements.
(Graver, 1977:6-33)

Putnam does not address this point explicitly. Some statements in
his article, however, seem to imply that he intends to model only error
correction maintenance costs. The continuously decreasing form of his
resource utilization function reinforces this impression. (Putnam)

There  is  little  question  that  the  costs  of  error  correction  main- 
tenance should  be  included  in  any  life-cycle  cost  estimate.  In  addition, 
some  enhancement  decisions  are  clearly  independent  of  the  original 
system  development.  In  other  cases,  however,  the  situation  is  not  so 
clear-cut. 

In the opinion of this author, some system enhancement maintenance
activity is the direct result of management decisions made before or
during development. As such, the costs of these enhancements should be
included in system life-cycle cost estimates. Two examples of this
include "bare-bones" systems and systems which must exist in a dynamic
environment.

There are a number of reasons why management may decide to procure
a "bare-bones" system (one which has only minimal capability). In some
cases, this may be done to ensure early system availability. In others,
it may not be clear what the mature configuration of a radically
innovative system should be. In either of these cases, if current
management decisions are based on the assumption that a bare-bones system
will be enhanced, the costs of that enhancement should be included in the
life-cycle cost estimate.

Another case where enhancement costs are relevant to life-cycle
cost estimates involves systems which must meet changing user needs in
a dynamic environment. One example of such a system would be a
scientific system which, from its inception, is intended to continuously
reflect the state-of-the-art. Another example would be a military
command and control system. In many cases, the initial assessment of the
wartime environment in which such systems will operate is already
obsolete when the system becomes operational. In such cases, continuing
enhancements to the software are necessary to ensure the continued
viability of the weapon system in a wartime environment. Thus,
life-cycle cost estimates should include the cost of such enhancements.

It is the opinion of this author that the maintenance portion of
software life-cycle cost estimates should contain two components. The
first is the cost of error correction, as recommended by Graver. The
second is the cost of those nondiscretionary enhancements that are
implicit in the assumptions inherent in the estimate. To date,
empirical research on maintenance costs has concentrated almost
exclusively on the cost of error correction, and even that has been
minimal.

Graver's group investigated three questions concerning the cost of
software maintenance, using data from two large, ground based command
and control systems. In the first case, they definitively established
that the total number of problems reported for a system was related to
the number of design changes and error corrections that had previously
been incorporated into the system. (Graver, 1977:6-38) In fact, their
results indicate that Brooks may have been optimistic when he estimated
that each error correction has a 20 to 50 percent probability of
introducing another error. (Brooks, 1975:122)
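
The practical weight of Brooks' estimate can be shown with a simple
calculation. Assuming, purely for illustration, that each correction
independently introduces exactly one new error with probability p, the
expected number of corrections needed per original error is the
geometric series 1 + p + p^2 + ... = 1/(1 - p). The function name below
is hypothetical, and the independence assumption is not drawn from
either Brooks or Graver.

```python
# Expected corrections per original error when each correction has
# probability p of introducing one further error (independence assumed).

def expected_corrections(p):
    """Sum of the geometric series 1 + p + p**2 + ... = 1 / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return 1.0 / (1.0 - p)


for p in (0.2, 0.5):
    print(p, expected_corrections(p))
```

At the low end of Brooks' range each original error costs 1.25
corrections on average, and at the high end it costs 2, so a higher true
probability would raise error correction costs further.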

Graver also attempted to show that validation and verification
(V&V) efforts were effective in reducing system errors. Unfortunately,
this analysis was flawed, in that it was based on the number of errors
found through V&V, rather than on the number of errors which escaped
detection by V&V groups. (Graver, 1977:6-38, 40)

The last maintenance oriented question addressed by Graver's group
dealt with the effect of programming language on the number of personnel
required to maintain a particular system. They present convincing
evidence that, all other things being equal, it takes approximately four
times as many programmers to maintain a MOL program as a similar
program written in a HOL. Thus, the claim that HOL's reduce overall
maintenance costs seems to be justified. (Graver, 1977:6-40, 42)


An  Introduction  to  PRICE  S 


PRICE S is a software development cost/schedule estimation model
that is currently being developed by RCA PRICE Systems, Inc. It is
based on the same parametric cost modeling methods that were used to
develop their highly successful PRICE model for hardware cost/schedule
estimation. A complete description of PRICE S is beyond the scope of
this report. Instead, a brief description of the PRICE S inputs,
outputs, and operating environment must suffice. This discussion is
based on a two day introductory course on PRICE S which was held at the
RCA facility at Moorestown, New Jersey on 8 and 9 May 1977.

PRICE S inputs may be divided into three distinct categories:
"hard" system parameters, "soft" system parameters, and "environmental"
parameters. A full list of PRICE S input parameters is contained in
Appendix A.

The environmental parameters describe the environment within which
the system will be developed. They address such factors as escalation
(price inflation) and rate of technological improvement. These
parameters permit users to vary the economic assumptions that they use in
comparing historical data to the present. The present research does
not address these parameters, except to use RCA's pre-set values without
question. They will not be discussed further in this report.

The "hard" input parameters represent things which can be physically
measured in the completed system. They include such items as total
number of executable machine instructions and the proportion of the total
instructions that represent each of seven different "application types."
While they may or may not be well defined at the time a cost estimate
is made, historical data can be used to generate firm measures of these
quantities for any completed system.

The "soft" input parameters, on the other hand, are more subjective
in nature, and cannot be measured directly from historical data. PRICE
S uses three "soft" parameters that are relevant to the present research.
Software Design Complexity (SDCPLX) is a measure of the "inherent
difficulty" of the system development task. Theoretically, it relates the
difficulty of the task to the shop that is to accomplish the task.
Engineering Complexity (ENCPLX), on the other hand, is a measure of how
long the project "should" take. It seems to be based on the assumption
that there is a "natural" project length that is characteristic of any
system development project. Finally, Platform (PLTFM) is a measure of
the reliability that will be required of the system. It seems to be
related to the stringency of the system specifications.

The outputs of the PRICE S model (in normal mode) include cost and
schedule estimates. The cost estimates are broken out in matrix form.
The columns of this matrix are the three phases of the software
development process: design, implementation, and test and integration.
The rows are the five productive activities described above: systems
engineering, programming, configuration management, documentation, and
program management. The schedule estimates that are output include begin
and end times (in months) for each of the three phases mentioned above.
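
The layout of the cost output described above can be pictured as a small
activity-by-phase matrix. The sketch below is only a hypothetical data
structure for holding such a matrix and totaling it by row and column;
it is not PRICE S output, the cost figures are invented, and the names
are assumptions made for illustration.

```python
# Hypothetical container for a PRICE S style cost breakdown:
# five activities (rows) by three development phases (columns).

PHASES = ("design", "implementation", "test and integration")
ACTIVITIES = (
    "systems engineering",
    "programming",
    "configuration management",
    "documentation",
    "program management",
)


def phase_totals(matrix):
    """Total cost of each phase, summed over all activities."""
    return {p: sum(matrix[a][p] for a in ACTIVITIES) for p in PHASES}


def activity_totals(matrix):
    """Total cost of each activity, summed over all phases."""
    return {a: sum(matrix[a][p] for p in PHASES) for a in ACTIVITIES}


# Invented figures (thousands of dollars), for illustration only.
example = {
    "systems engineering":      {"design": 40, "implementation": 10, "test and integration": 20},
    "programming":              {"design": 15, "implementation": 90, "test and integration": 35},
    "configuration management": {"design": 5,  "implementation": 10, "test and integration": 10},
    "documentation":            {"design": 10, "implementation": 15, "test and integration": 10},
    "program management":       {"design": 10, "implementation": 15, "test and integration": 10},
}

print(phase_totals(example))
print(activity_totals(example))
print(sum(phase_totals(example).values()))
```

Summing the matrix either by rows or by columns gives the same total
development cost, which provides a simple consistency check on any
breakdown of this form.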

In addition to the normal mode outputs, PRICE S may be run "backwards"
with historical cost and schedule information as input. In this mode,
the model may be used to estimate values for SDCPLX and ENCPLX.

One unique characteristic of the cost estimation models developed
by RCA PRICE Systems, Inc. is the concept of "boxing." Under this
concept, each component of a system may be described and estimated
individually as a separate "black box." The user may then estimate total
system cost by using the output from each separate component estimation
as input to a combined "system" estimate. The model automatically
accounts for the additional "systems-level" work that would be required
to design and implement the separate components as part of a coherent
system instead of individually.




PRICE S has been implemented in a commercial time-sharing
environment. Users contract with RCA PRICE Systems, Inc. for the use of
the program. This contract entitles them to access the program by
"phoning" the time-sharing facility using a standard teletype terminal.
In addition to the PRICE S model itself, the user also has access to a
relatively sophisticated utility program which allows him to build, edit,
and manipulate PRICE S input and output files.




III.  METHODOLOGY 


The  Research  Problem 

Statement of the Problem. The problem addressed by this research
is the need for a workable, accurate method of making preliminary
software cost estimates at Aeronautical Systems Division (ASD).

Conversations with ASD cost analysts during the formative stages
of the research have led this author to believe that preliminary budget
estimates of software costs require the analyst to draw heavily on
historical analogy. Even when parametric equations are used, analogy is
required to estimate the size and difficulty of the system. The ability
to draw analogies depends directly on the availability of descriptive,
historical data.

Purpose. The purpose of the research is to investigate ways of
gathering and using descriptive data for the purpose of making
preliminary software cost estimates.

The orientation of the research toward the PRICE S model occurred
primarily because several ASD cost analysts expressed an interest in the
model. Further investigation revealed that Air Force Avionics Laboratory
(AFAL) personnel were also interested in PRICE S. Through a fortunate
coincidence, it was found that RCA PRICE Systems, Inc. was about to
field test the program by allowing a group of prospective customers to
use it on a trial basis at no cost. On the understanding that such
participation in no way constituted an endorsement of PRICE S by the
government, this author was allowed to participate in the test program.


Objectives. Two specific research objectives were chosen. The
first objective was to develop a methodology for gathering descriptive
data on software systems. The second was to use historical ASD
software acquisition data to "calibrate" the PRICE S model.

The term "calibrate" can be used in two ways within a cost
estimating context. ASD cost analysts speak of a cost analyst
"calibrating himself" to a particular cost model. By this, they mean
that the analyst has learned to choose input parameters for the model
that result in accurate cost estimates. Thus, a "calibrated" cost
analyst is able to correctly evaluate and adjust for subjective factors
and model deficiencies in order to arrive at an accurate estimate.
Although it is hoped that this research will contribute to the
"calibration" of future ASD analysts, the specific objective of the
research is to calibrate PRICE S in a different sense.

According to Webster's Seventh New Collegiate Dictionary, calibrate
means "to determine, rectify, or mark the gradations of." The "soft"
PRICE S input parameters described above all use a numerical scale, but
the individual values on these scales have no inherent meaning. It is
only by associating known characteristics of real systems with
particular values of these parameters that these values take on meaning.
It is in this sense of "marking" individual values of the "soft"
parameters by association with observed fact that the term calibration
is used.

Scope and Limitations. The research applies to operational flight
software systems procured by ASD System Program Offices (SPO's). Thus,
only software systems that were actively involved in the flight and/or
mission performance of aircraft in real time were included. As a result,
any application of the results of this research to other types of
software or other organizations should be done with caution.

Assumptions. The research is based on the following assumptions:
Cost/schedule information reported by contractors to SPO's is assumed to
be a "reasonable estimate" of the true cost of the item involved. The
validity of the PRICE S model is assumed. Thus, it is assumed that the
model accurately reflects the effects of the various input parameters
on software development cost and schedule.

Data  Collection 

The research objective of developing a methodology for gathering
descriptive data on software was addressed through the data collection
effort itself. This effort was designed to accomplish three objectives.
First, it was used to screen prospective systems and eliminate those
that were unsuitable for further study. Next, the data collection
effort had to provide sufficient information to permit PRICE S model
inputs to be generated. Finally, the effort was designed to acquire
sufficient additional information to put the system in perspective. As
it finally evolved, the data collection effort consisted of three parts:
an initial interview, cost/schedule data collection, and statistical
data collection.

The Initial Interview. A formal interview format, using both
closed and open-ended questions, was developed. This proved to be an
effective screening device as well as a tool for gathering background
information about individual systems. Because of the limited number of
prospective systems, it was not feasible to formally validate the
interview format. Instead, several knowledgeable individuals were
consulted and their advice used in modifying the format. In addition,
some questions were reworded when field experience showed them to be
awkward or misleading. Because of the informal and free-flowing nature
of the interviews, it is believed that these modifications did not
affect the information gathered in any way. The interview format is
contained in Appendix B.

Because only a small number of the prospective systems were able
to pass the screening criteria, it is important to report what these
criteria were. First, only operational flight software systems were
considered. Of these, only those written in an assembly language were
acceptable. This is because, at present, PRICE S is not oriented toward
HOL's. Next, the cost of the system development had to be easily
derivable from the official reports provided to the SPO by the
contractor. In general, this meant that software development was either
a separate, reportable "line item" on the contract, or that the entire
contract was for software system development. Finally, the system had
to be at a stage of development where software development costs were
known with a high degree of accuracy. The worst case that was accepted
involved a system that was well into formal qualification testing with
few apparent problems. Only two systems were able to meet these
criteria.
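
The four screening criteria can be read as a simple conjunctive filter. The sketch below is only an illustration of that reading. The record layout, field names, and example systems are hypothetical and do not come from the interview format in Appendix B.

```python
from dataclasses import dataclass


@dataclass
class CandidateSystem:
    """Hypothetical record of facts gathered in the initial interview."""
    name: str
    operational_flight_software: bool
    assembly_language: bool
    software_cost_separable: bool  # separate line item, or software-only contract
    costs_known_accurately: bool   # e.g. well into formal qualification testing


def passes_screening(system: CandidateSystem) -> bool:
    """A system is retained only if it meets all four criteria."""
    return (
        system.operational_flight_software
        and system.assembly_language
        and system.software_cost_separable
        and system.costs_known_accurately
    )


# Made-up candidates, not the systems examined in the research.
candidates = [
    CandidateSystem("Example 1", True, True, True, True),
    CandidateSystem("Example 2", True, False, True, True),  # written in a HOL
    CandidateSystem("Example 3", True, True, False, True),  # cost not separable
]
accepted = [c.name for c in candidates if passes_screening(c)]
print(accepted)
```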

Prospective systems were identified and the initial interviews were
arranged through the Information Engineering Branch of the Directorate
of Avionics Engineering at ASD (ASD/ENAI). This organization provides
software engineering support to all ASD SPO's with ongoing avionics
software development projects. The initial interview was normally held
with the responsible software engineer in order to qualify the system
before other SPO personnel were approached. If the system looked
promising, the responsible project officer was approached to collect
the appropriate cost/schedule information.

Cost/Schedule Data Collection. Cost and schedule data proved to
be the most difficult information to collect. Unfortunately, until
very recently computer software was not generally recognized as a
separate component of avionics systems. As a result, for most of the
systems investigated, software development costs were simply not
available. These systems were dropped from further consideration. In
those cases where software development costs were reported separately,
several adjustments were made to attempt to make the reported costs
more "realistic."

First, any separable nonsoftware costs were subtracted from the
initial total. Of course, any identifiable costs that were directly
allocable to software development were added to the total. In order to
maintain consistency, it was decided to include "burden" (factory
overhead) while excluding general and administrative (G&A) costs as
well as fee or profit. The rationale for this decision was that the
cost of work space and vital services is often a major component of
burden, while G&A costs depend on the upper echelons of the
organization rather than on the software development team itself.
Profit, of course, is not normally included in cost estimates.

The major exception to the above policy was the treatment of
"supporting contracts" such as Independent Verification and Validation
(IV&V) contracts, or small research contracts that related directly to
the software system. The total costs of such contracts, including
profit, were added to the software development cost on the theory that
they were direct costs to the Air Force of acquiring the software. In
accordance with common Air Force practice, no SPO operating costs or
Air Force personnel costs were included.
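
The adjustment rules above amount to a short piece of arithmetic. The following sketch shows one way to express them. It assumes that the contractor's reported total already contains burden, G&A, and fee, which the text does not state explicitly. All parameter names and dollar figures are hypothetical.

```python
def adjusted_software_cost(
    reported_total,        # contractor-reported total (assumed to include burden, G&A, fee)
    nonsoftware_costs,     # separable costs unrelated to software
    added_software_costs,  # identifiable software costs reported elsewhere
    g_and_a,               # general and administrative costs in the reported total
    fee,                   # fee or profit in the reported total
    supporting_contracts,  # full price (including profit) of IV&V and similar contracts
):
    """Apply the adjustments described in the text; burden is left in the total."""
    cost = reported_total
    cost -= nonsoftware_costs
    cost += added_software_costs
    cost -= g_and_a
    cost -= fee
    cost += supporting_contracts
    # SPO operating costs and Air Force personnel costs are never added.
    return cost


# Made-up figures in thousands of dollars.
example = adjusted_software_cost(1000, 150, 40, 60, 70, 120)
print(example)  # 880
```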

Insofar as schedule was concerned, no attempt was made to separate
the developments into phases. Each development was considered to have
started when work began on the contract. This appeared to be a
reasonable assumption, at least for the systems examined. The
development end date was considered to be the date when either the Air
Force formally accepted the software, or it was formally incorporated
into a higher level system for the purpose of system testing at that
higher level.

Statistical Data Collection. This was by far the most time
consuming part of the data collection effort. Its purpose was to
collect the data necessary to describe the software systems studied in
terms of the "hard" PRICE S input parameters. It involved a statistical
sampling of the individual modules of code that made up each system.
Before the sample could be taken, however, several preliminary tasks
had to be accomplished.

Once the initial interview and the cost/schedule data had been
collected, the researcher normally spent several hours becoming
familiar with the system documentation. With the assistance of the
software engineer who was most familiar with the system being studied,
the researcher was always able to find the information required to
collect the data sample. This included: a complete list of all the
modules in the system, a method of tracing the functional hierarchy of
the modules, a system diagram that related the modules to "major"
functional tasks, and a complete copy of the system assembly language
code. When the researcher was satisfied with his ability to work with
the available documentation, he could address the next step.

The last task other than the sampling itself was to establish a
means of associating each module chosen with a single PRICE S
"application type." This was done with the help of the software
engineer. Using descriptive terms similar to those in Table III, each
"application type" was described in terms of the additional
considerations or constraints which acted to make the programmer's job
more difficult than it would otherwise be. The concepts involved were
discussed until both parties agreed that a common understanding had
been reached. The researcher then asked the engineer if it would be
reasonable to assign a single "application type" to each functional
area on the system diagram mentioned above. In no case did the
researcher encounter any problem or resistance to the allocation of
"application types" to modules on a simple, functional basis. Once this
was accomplished, statistical sampling could begin.

Originally, the researcher planned to look at every module in every
system studied. This quickly proved to be impractical. As an
alternative, a sequential sampling technique was developed which
resulted in a substantial reduction in the time necessary to evaluate a
particular system.


TABLE III

PRICE S Application Types

Application Type           Weight   Identifying Characteristics
-------------------------  ------   -----------------------------------
System                     10.95    Heavy hardware interface.
                                    Many interactions.
                                    High reliability and strict timing
                                    requirements.
                                    Typified by operating systems.

Interactive                10.95    Interfaces with human operators.
                                    Human engineering considerations
                                    and error protection very
                                    important.

Real Time Command           8.46    Real time communications under
and Control                         tight timing constraints.
                                    Queueing not practicable.
                                    Strict protocol requirements.
                                    Heavy hardware interface.

On-Line                     6.17    Real time communications with
Communications                      queueing allowed.
                                    Timing restrictions not as
                                    restrictive as with REAL TIME
                                    COMMAND AND CONTROL.

Data Storage and            4.10    Secondary storage handling.
Retrieval                           Data blocking and deblocking.
                                    "Hashing" techniques.
                                    Hardware oriented.

String                      2.31    Routine applications with no
Manipulation                        overriding constraints.
                                    Not oriented toward mathematics.
                                    Typified by language compilers.

Mathematical                0.87    Routine mathematical applications
Applications                        with no overriding constraints.


The sequential sampling technique was based on the "weighted
instruction count" (weight) of each module. In equation form:

$$ w_i = d_j \, n_{ij} $$

where:  $w_i$ = the "weight" of the i-th module.

        $d_j$ = the "density," or weighting factor, associated with
        application type j.

        $n_{ij}$ = the number of instructions in module i, which
        happens to be of application type j.

The weight of the entire system is then simply the sum of the weights
of the individual modules. System weight is the basis from which cost is
calculated in the PRICE S model (Freiman, 1977).
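
As a brief illustration of the weight calculation, the sketch below multiplies each module's instruction count by the density of its application type from Table III and sums the results. The function names and the two example modules are hypothetical. Only the density values are taken from the table.

```python
# Densities (weighting factors) by application type, copied from Table III.
DENSITY = {
    "System": 10.95,
    "Interactive": 10.95,
    "Real Time Command and Control": 8.46,
    "On-Line Communications": 6.17,
    "Data Storage and Retrieval": 4.10,
    "String Manipulation": 2.31,
    "Mathematical Applications": 0.87,
}


def module_weight(instructions, application_type):
    """Weight of one module: density of its type times its instruction count."""
    return DENSITY[application_type] * instructions


def system_weight(modules):
    """System weight is the sum of the individual module weights."""
    return sum(module_weight(n, app_type) for n, app_type in modules)


# Two made-up modules: (number of instructions, application type).
modules = [(200, "System"), (100, "Mathematical Applications")]
print(system_weight(modules))  # roughly 2190 + 87
```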

Appendix C contains a detailed derivation of the sequential sampling
equation used in the research. Briefly, an initial, random sample of 50
to 100 modules was chosen without replacement. The number of executable
machine instructions, application type, and hierarchical level were then
recorded for each module chosen. The number of modules required to
obtain a 95 percent confidence level that the mean module weight was
within plus or minus ten percent of the sample mean was then calculated
using:

$$ n = \frac{N}{1 + \left( \bar{w}^{2} / 384 \right) (N - 1) / s_w^{2}} $$

where:  $n$ = the desired sample size.

        $N$ = the total number of modules in the system.

        $\bar{w}$ = the mean weight of the latest sample.

        $s_w^{2}$ = the sample variance of w (module weight).

Sampling without replacement was then continued until the cumulative
sample size reached n. At this point, a new sample mean and variance
were calculated and the process was repeated until the new value of n
was less than or equal to the existing sample size. The weighting
factors used for each application type were those being used by RCA
PRICE Systems, Inc. on 9 May 1977. They are listed in Table III.
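
The sequential procedure can be sketched in a few lines. The example below is an illustration under stated assumptions, not the procedure of Appendix C. The function names, the shuffled-order device for sampling without replacement, and the synthetic population are all hypothetical. The constant 384 appears to correspond to (1.96 / 0.10) squared, about 384.16, that is, the 95 percent normal deviate divided by the ten percent tolerance.

```python
import math
import random
import statistics


def required_sample_size(N, mean_w, var_w):
    """Sample size equation from the text: n = N / (1 + (w^2/384)(N-1)/s^2)."""
    return N / (1 + (mean_w ** 2 / 384.0) * (N - 1) / var_w)


def sequential_sample(weights, initial=50, seed=0):
    """Grow a sample without replacement until it is at least as large as n.

    `weights` holds the weight of every module in the system (at least two).
    In practice each weight is only measured when the module is drawn.
    """
    N = len(weights)
    order = list(range(N))
    random.Random(seed).shuffle(order)  # fixed random order = no replacement
    size = min(initial, N)
    while True:
        sample = [weights[i] for i in order[:size]]
        mean_w = statistics.mean(sample)
        var_w = statistics.variance(sample)
        if var_w == 0:
            return sample, mean_w       # no spread observed; nothing to refine
        n = math.ceil(required_sample_size(N, mean_w, var_w))
        if n <= size:
            return sample, mean_w       # stopping rule from the text
        size = min(n, N)                # continue sampling up to n modules


# Synthetic population of 400 module weights, for illustration only.
rng = random.Random(1)
population = [rng.uniform(50, 500) for _ in range(400)]
sample, mean_w = sequential_sample(population)
print(len(sample), round(mean_w, 1))
```

Because n is always smaller than N for a positive variance, the loop must stop no later than the point at which every module has been examined.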

Methods  of  Analysis 

A number  of  analytical  methods  were  used  to  accomplish  the  research 
objectives  described  above.  Subjective  evaluation  was  used  to  analyze 
the  data  collection  methodology.  Statistical  analysis  techniques  were 
used  to  calculate  PRICE  S parameters  and  analyze  possible  improvements 
to  the  data  collection  methodology.  In  addition,  experimental  design  and 
sensitivity  analysis  techniques  were  used  to  calibrate  the  PRICE  S model. 
Each  of  these  methods  is  discussed  briefly  below. 

Subjective Evaluation. The analysis of the data collection
methodology was, of necessity, primarily subjective in nature. It was
oriented toward two related goals: analyzing data collection problems,
and evaluating the usefulness of the data actually collected.

Statistical Analysis. With the exception of the sequential sampling
technique described above, the statistical analysis methods used in the
research were neither extensive nor complicated. Point estimation
techniques were used to calculate PRICE S input parameters and other
descriptive statistics. In addition, "t-tests" were used to test some
hypotheses concerning possible improvements to the statistical data
collection methodology. Freund's Mathematical Statistics contains an
excellent discussion of both of these techniques. (Freund, 1971)

Although it is not a formal statistical analysis technique, it should
be noted that frequency distribution plots were used to graphically
illustrate the probability distribution of "number of instructions per
module" within the software systems studied. To generate this type of
plot, the possible values which a particular random variable may assume
are divided into an ordered set of equal sized intervals, called cells.
A random sample of that variable is then selected. The proportion of
times that the variable assumes a value within each cell is then plotted
against the ordered set of cells. The resulting plot approximates the
shape of the probability density function for the random variable in
question.
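
The construction just described can be illustrated with a short sketch that computes the proportion of sample values falling in each cell and draws a crude text plot. The function name, cell width, and sample values are hypothetical and are not data from the systems studied.

```python
def relative_frequencies(values, cell_width, n_cells, start=0):
    """Proportion of sample values falling in each equal-sized cell.

    Cell k covers the half-open interval
    [start + k * cell_width, start + (k + 1) * cell_width).
    Values outside the cells are counted in the total but in no cell.
    """
    counts = [0] * n_cells
    for v in values:
        k = int((v - start) // cell_width)
        if 0 <= k < n_cells:
            counts[k] += 1
    return [c / len(values) for c in counts]


# Made-up sample of "instructions per module".
sample = [5, 12, 18, 25, 33, 38, 41, 47]
freqs = relative_frequencies(sample, cell_width=10, n_cells=5)

# One row per cell, with a bar proportional to the relative frequency.
for k, f in enumerate(freqs):
    print(f"{k * 10:3d}-{k * 10 + 9:3d}  {'*' * round(f * 40)}")
```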

PRICE S Methodology. The data collected on each system was used to
calculate the initial inputs for the PRICE S model. Since all systems
studied fell under the general category of "military avionics," the
"soft" input parameter PLTFM, which measures the stringency of the
development specifications, was set to 1.8. This is the value
recommended by RCA PRICE Systems, Inc. for "MIL-Spec Avionics" systems.
(RCA PRICE Systems, 1977:27) Using these values, a preliminary PRICE S
run was made to calculate initial values for the "soft" parameters
SDCPLX and ENCPLX, which relate the difficulty of the development task
to "the shop doing the work," and "the normal time required for its
accomplishment," respectively. (RCA PRICE Systems, 1977:12, 20)

Once the initial run had been made, the researcher made a series of
runs to evaluate the results of "adjustments" to the initial input
parameters. Even with the ground rule that no parameter which had been
objectively measured could be altered, this was a very subjective
process. The intent of the adjustments was to find a set of input
parameters which both included the objectively measured data and
satisfied the researcher that it adequately reflected his subjective
evaluation of the development effort. The result of the adjustment runs
was a set of "baseline" PRICE S parameters, which the researcher
believed best reflected the system development being studied.

In order to provide for comparability between systems, the baseline
parameters were then input to a "baseline" PRICE S run. For this run,
the project cost and schedule were estimated in constant 1977 dollars,
with the project start date (ENDSS) adjusted to 1 February 1977. (The
February date is convenient for PRICE S users because of peculiarities
in the computer program.) This baseline run was the starting point for
a series of twenty PRICE S runs that were designed to provide data for a
sensitivity analysis of the model.

Six parameters were selected for a formal analysis of the
sensitivity of the PRICE S model to input parameter changes. Three of
these parameters were believed to have a relatively simple,
straightforward effect on the cost and schedule estimates output by the
model. These parameters were: "the total number of instructions"
(TNINST), "the instruction density" (IDENS), and "fraction of available
capacity utilized" (UTIL). To analyze the effects of these variables
upon estimated cost and schedule duration, each of these parameters was
varied separately, in small increments about its baseline value, while
the other PRICE S inputs were held constant. Since the baseline run
acted as the center point for each variable, four PRICE S runs were
sufficient to establish five data points for each of the three
parameters examined in this way.
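
The one-parameter-at-a-time design described above can be enumerated mechanically. In the sketch below the baseline values and the size of the increments are hypothetical, since the text does not give them here. Only the parameter names and the count of four runs per parameter come from the text.

```python
def one_at_a_time_runs(baseline, factors):
    """Build input sets that vary one parameter while holding the rest constant.

    Each parameter in turn is multiplied by every factor in `factors`;
    all other parameters keep their baseline values.
    """
    runs = []
    for name in baseline:
        for factor in factors:
            run = dict(baseline)              # copy so the baseline is untouched
            run[name] = baseline[name] * factor
            runs.append(run)
    return runs


# Hypothetical baseline values and increments, for illustration only.
baseline = {"TNINST": 10000, "IDENS": 5.0, "UTIL": 0.5}
factors = (0.90, 0.95, 1.05, 1.10)  # two points below and two above the baseline

runs = one_at_a_time_runs(baseline, factors)
print(len(runs))  # 12: four runs for each of the three parameters
```

Together with the baseline run itself, the four runs for a parameter give the five data points mentioned in the text.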

The other three parameters which were evaluated were the three "soft"
parameters described above: SDCPLX, ENCPLX, and PLTFM. Because the
effects of variations in these parameters were not necessarily either
simple or straightforward, it was decided to use a more sophisticated
experimental design for this part of the analysis. In order to provide
sufficient data for an analysis of the interactions between these
parameters without unnecessarily increasing the number of PRICE S runs
required, it was decided to use a full factorial experimental design
with two levels for each of the three variables being analyzed. Thus,
eight PRICE S runs were required to reflect all possible combinations
of these input variables. Since the PRICE S model is deterministic in
nature, in that the same inputs always produce the same outputs, it was
not necessary to either randomize or replicate these runs.
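
A two-level full factorial design for three variables can be listed by taking every combination of the low and high settings. The sketch below does so with the standard library. The low and high values shown are hypothetical placeholders, not the levels used in the research.

```python
from itertools import product

# Hypothetical (low, high) settings for the three "soft" parameters.
levels = {
    "SDCPLX": (0.9, 1.1),
    "ENCPLX": (0.9, 1.1),
    "PLTFM": (1.6, 2.0),
}

# Every combination of one level per parameter: 2 x 2 x 2 = 8 runs.
factorial_runs = [dict(zip(levels, combo)) for combo in product(*levels.values())]

for run in factorial_runs:
    print(run)
```

Because the model is deterministic, the order in which these eight input sets are submitted does not matter, which is why the list is left in its generated order.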

Sensitivity Analysis Techniques. Two mathematical techniques were
used to analyze the results of the sensitivity analysis runs. These are
linear regression and analysis of variance (ANOVA). As used in this
study, linear regression is a widely accepted method of fitting linear
equations to numerical data, and ANOVA is a method of analyzing the
degree of interaction between independent variables in their effect on
dependent variables. Since the data being analyzed was not the result
of random (or pseudo-random) processes, these techniques were not used
as tools of statistical analysis. The interested reader is referred to
Draper and Smith for an excellent treatment of regression, as well as an
interesting approach to ANOVA. (Draper and Smith, 1966) Freund provides
a more traditional discussion of ANOVA. (Freund, 1971)
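
To make the idea of interaction concrete, the sketch below computes main-effect and interaction contrasts from a two-level full factorial of deterministic outputs. This is the usual contrast arithmetic for such designs, offered as an illustration rather than as the computation actually performed in the research. The factor names A, B, and C and the toy response function are hypothetical stand-ins for the PRICE S parameters and outputs.

```python
from itertools import combinations, product
from math import prod


def factorial_effects(factors, response):
    """Main effects and interactions from a two-level full factorial.

    Levels are coded -1 (low) and +1 (high). Each effect is the mean
    response where the coded product is +1 minus the mean where it is -1.
    No significance testing is involved; the outputs are deterministic.
    """
    runs = list(product((-1, 1), repeat=len(factors)))
    y = [response(dict(zip(factors, run))) for run in runs]
    half = len(runs) / 2
    effects = {}
    for order in range(1, len(factors) + 1):
        for idx in combinations(range(len(factors)), order):
            contrast = sum(prod(run[i] for i in idx) * yi
                           for run, yi in zip(runs, y))
            effects["x".join(factors[i] for i in idx)] = contrast / half
    return effects


def toy_model(x):
    """Made-up deterministic response with one A-by-B interaction."""
    return 10 + 2 * x["A"] + 3 * x["B"] + 1 * x["A"] * x["B"]


effects = factorial_effects(["A", "B", "C"], toy_model)
print(effects)
```

In this toy case the contrasts recover twice each coefficient of the response (4 for A, 6 for B, 2 for the A-by-B interaction) and zero for every term involving C.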


System  Descriptions 

The final phase of the research was the development of a format for
recording historical data on software systems. This format is intended
to provide a medium for recording the essential facts necessary to
describe a historical software system in PRICE S terms. This format was
developed with a number of guidelines and restrictions in mind. First,
it was decided to restrict the format to no more than one page in
length. At the same time, it was desired that the format be as simple
and self-explanatory as possible. In addition, while the prime emphasis
of the format was to record factual data, sufficient information to
communicate the purpose and scope of the system should be provided.
Finally, there had to be a way of recording any "special case"
information that would be of interest to another cost analyst. The
characteristics of the two systems studied were recorded on the new
format. This information is presented in Appendix D.


IV.  ANALYSIS AND RESULTS


Overview  of  the  Systems  Studied 

All in all, ten software systems were considered for inclusion in
this study. These included software developed for the following weapons
systems managed by ASD:

B-1 (two systems considered)
AMST
EF-111
ALQ131
PAVE TAC
F-16
ALS
Wild Weasel
F-15

Although all the project personnel who were contacted were most
cooperative, only two of the ten systems passed the screening criteria
described above in Chapter III (The Initial Interview). These systems
will be referred to in this report as System A and System E. Each is
described briefly below.

System A. This software system is part of a sophisticated electronic
jamming system. The software analyzes and identifies radar signals and
allocates jamming resources in order to counter threats. Although the
software is resident on two separate computers, it was developed by a
single team under unified management. In general, the software on one
machine does the actual jamming tasks, while the software on the other
machine controls the interface with the human operator. The entire system



System E. This software system is unusual for an operational
flight system, since it does not leave the ground. It does exercise
control over an aircraft in flight, in real time, however, so it must be
considered to be in the same class as other operational flight software
systems. Its purpose is to navigate and control a guided munition along
a given trajectory. The software is resident in a ground-based computer
which communicates with the munition and with its parent aircraft through
a real-time data link. It was developed as part of an advanced
development effort to build a prototype weapon system. Failure of this
system could result in loss of control of high explosive munitions in
flight; this could endanger human life.

System E was developed concurrently with the rest of the prototype
weapon system. The software developer was not responsible for most of
the hardware development, which was under a separate contract. Several
other tasks, not related to the software development, were included in
the software development contract. The costs of these tasks were
reported separately, however, and could be "backed out" of the total
contract costs. The ASD project manager who was responsible for
monitoring the software development effort reviewed the researcher's
cost calculations. He indicated that although some nonrelevant costs
were probably still contained in the total, they were probably not
large enough to be considered significant.

The System E software development team was considered to be
exceptionally well qualified for the task by the ASD software engineer
who was most familiar with the project. In addition, this individual
indicated that there was considerable pressure to finish the development
more quickly than "normal." The specifications were described as being
lax, but did not seem to be inordinately so. Computer memory capacity
was perceived as a problem, but hardware timing was not. As with System
A, this system was designed to minimize the impact of hardware timing
problems.

Evaluation  of  the  Data  Collection  Methodology 

The Initial Interview. The interview format contained in Appendix B
served several useful purposes. It provided the researcher with an
overview of the software system. It also served as a filter, allowing
the researcher to determine within an hour or less whether or not the
system in question was a suitable subject for further study. In
addition, it served as a simple medium for collecting some of the
descriptive information, both objective and subjective, that should be
included in a historical software data base. In particular, the
questions concerning the operating environment of the software system,
the computer on which the system would operate, and the functions
performed by the system are essential to the description of any
software system. A few points concerning the initial interview are
worthy of special note.

The list of "major sub-functions performed" shows promise of becoming
an excellent tool for assessing how comparable two systems really are.
The intent of the question was to identify the component functions which
the system performed in achieving its ultimate purpose. The software
engineers contacted by this researcher were always able to provide
complete, concise answers to this question with very little effort.
Given this information, it should be possible to compare the lists of
functions performed for two purportedly similar systems. Such a
comparison should quickly reveal whether or not the similarity is only
superficial. It should also reveal the major functional areas of
difference between two similar systems. Of course, other considerations
such as the method of implementation used and the accuracy required must
be considered before the similarity between the systems can be
established in detail.
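
A comparison of two sub-function lists of this kind might be sketched with simple set operations. The function names in the example and the overlap ratio (shared functions divided by all distinct functions) are assumptions introduced for illustration. The text proposes the comparison but does not prescribe a measure.

```python
def compare_functions(first, second):
    """Compare the sub-function lists of two purportedly similar systems."""
    a, b = set(first), set(second)
    union = a | b
    return {
        "shared": sorted(a & b),
        "only_first": sorted(a - b),
        "only_second": sorted(b - a),
        # Fraction of all distinct functions that both systems perform.
        "overlap": len(a & b) / len(union) if union else 0.0,
    }


# Made-up function lists, not those of the systems studied.
system_x = ["navigation", "threat identification", "operator display"]
system_y = ["navigation", "guidance", "operator display", "data link"]

result = compare_functions(system_x, system_y)
print(result)
```

A low overlap would suggest that the apparent similarity is superficial, while the two "only" lists point to the functional areas of difference.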


Not all the questions on the interview format worked as well as
anticipated, however. In retrospect, it is apparent that the question
which asked for an estimate of the percentage of system capacity
actually utilized did not produce reliable answers. Such a determination
should be based on objective test data, or measurements made by the
researcher. This question is useful for determining whether or not
system capacity problems affected the development effort. Given an
indication that such problems existed, however, the researcher should
seek objective data concerning the extent of the problem.

There are several ways that the researcher may seek objective data
concerning the fraction of system capacity utilized. First, if the
project was of any size and the problem was serious, there should be some
correspondence, formal reports, or other documentation of the extent of
the problem. If not, the researcher may attempt to estimate the extent
of the problem himself. For simple memory capacity problems, where no
secondary storage is involved, it may be possible to determine the
fraction of system capacity utilized by analyzing a system "load map."
Also, a simple estimate of processing capacity utilized may often be
calculated from formal test reports. However, if system capacity usage
cannot be defined in a simple, straightforward manner, the researcher
should not attempt to estimate the fraction of capacity utilized without
assistance from knowledgeable systems engineers.


Cost/Schedule Data Collection. The collection of accurate cost and
schedule data for avionics software development projects poses some unique
problems even if the trend toward more detailed reporting of system
development costs continues. Based on discussions with the ASD personnel
responsible for monitoring the development of ten separate avionics
software systems, it is apparent that such software is seldom procured
by itself. Rather, avionics software is normally developed as an integral
part of a larger system which is made up of both hardware and software
components.

Under such conditions, two very basic problems arise for the
researcher who is interested in isolating software cost/schedule data.

First, it will normally be advantageous, perhaps even essential,
for the system developer to organize some functions that contribute to
software development at the overall project level. This is especially
true of systems engineering; to a lesser extent it applies to configuration
management, documentation, and program management as well. Normally,
only the direct cost of the programming function itself will be directly
allocable to software development. This problem is compounded for ASD
because each contractor must be allowed the latitude to organize his
development team as he sees fit. Thus, the existence of a separate,
project-wide systems engineering cost account will mean different things
for different projects. This, combined with the fact that systems
engineering work is, by its very nature, directly allocable to a specific
system component such as software, makes the "true cost of software
development" quite difficult to define, much less measure. In the opinion
of this researcher, this problem is endemic to the way that systems are
procured at ASD. As a result, any attempt to force all costs that are
relevant to software development into a discrete set of cost accounts
would be futile, if not actually counterproductive. Instead, it would be
better for future researchers to agree on a consistent convention for
allocating systems engineering costs to software development. Such an
allocation could, for example, be made on the basis of the "direct
engineering man-hours" associated with the design of the various system
components.
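
As an illustration only, the man-hour convention suggested above might be sketched as follows. The function name, the component names, and every figure in the example are hypothetical; they are not taken from any ASD project, and the proportional rule shown is merely one reading of the suggestion.

```python
def allocate_shared_cost(shared_cost, direct_hours):
    """Split a project-wide cost account across system components.

    shared_cost  -- total of the shared (e.g. systems engineering) account
    direct_hours -- mapping of component name to direct engineering man-hours
    Each component receives a share proportional to its man-hours.
    """
    total_hours = sum(direct_hours.values())
    if total_hours <= 0:
        raise ValueError("direct engineering man-hours must sum to a positive value")
    return {
        component: shared_cost * hours / total_hours
        for component, hours in direct_hours.items()
    }


# Hypothetical example: a 1000-unit systems engineering account shared by
# a software component and a hardware component.
shares = allocate_shared_cost(1000.0, {"software": 300.0, "hardware": 700.0})
print(shares)
```

Whatever rule is adopted, the point made in the text is that it should be applied consistently from project to project, so that the resulting software costs remain comparable.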


Another, related problem concerns the length of the software development
effort. At some point in time, avionics software ceases to be a
separate entity and is incorporated into the system of which it is a
part. Ordinarily, this occurs well before the system development itself
is complete, since the entire system must be tested. If the software
development team were disbanded at this point, and no more costs were
incurred for software, this would create no problems. This is seldom the
case, however. The software development team normally transitions
naturally from the development phase of the software system life-cycle to
the maintenance phase while the overall project is still in development.
Thus, software related costs continue to be incurred even after the
software development is complete.


The solution to this problem is not nearly as difficult as the cost
allocation problem. The formal certification that the software is ready
for testing at the next higher system level could easily be made a
contractual milestone. The date on which this milestone occurs should be
considered to be the end of the software development effort for historical
data collection purposes. Any software related costs occurring after this
date should be allocated to software maintenance. In the case of larger
systems, where it is advantageous to control software on the basis of a
unit smaller than the entire system, this transition could be considered
to occur on a unit-by-unit basis, rather than as a single, system-wide
transition.

Statistical Data Collection. The sequential sampling technique that
was used to measure the proportion of instructions that were of each
"application type" served its purpose of significantly lowering the number
of modules that had to be examined. The distribution of "number of
instructions per module" that was encountered, however, caused this
technique to be less efficient than was originally anticipated. Figures 2
and 3 show the actual distribution of "instructions per module" that was
encountered for the two systems examined. These figures show clearly
that there was a heavy preponderance of small modules. In both cases,
the most frequently occurring number of instructions per module was less
than twenty. Application of the "weighting factor" (due to PRICE S
application types) to the number of instructions per module did not
appreciably change the shape of these distributions, as is shown by
Figures 4 and 5.

Table IV summarizes the calculation of the number of modules to be
sampled for the two systems examined. In both cases, the number of
modules required was significantly more than one half of the total number
of modules in the system. It must be noted that in the case of System E,
two extremely large "real-time control" type modules, which together
contributed almost 14 percent of the total "weight" of the system, were
eliminated from the calculations. This was done for two reasons.


[Figure: Frequency Distribution for Instructions per Module - System A
(horizontal axis: Instructions per Module)]

TABLE IV

Summary of Sample Size Calculations

                                                        Values
Variable  Explanation                              System A   System E*

N         Total modules in system                  307        (illegible)
w         Mean "weight" per module                 491.4      415.6
sw        Standard deviation of weight per module  648.3      433.4
p         Fraction of total modules required
          for the sample                           0.686      0.587
n         Number of modules required in sample     211        (illegible)

*Two modules were deleted from the System E calculations.


First, there was evidence that these two modules were unique, and that
they may have been incorporated into the system virtually unchanged from
previous work done by the developer. In addition, their combined effect
on the total number of modules required for the sample was significant.
Eliminating these two modules reduced the proportion of modules required
for the sample by 17 percentage points (from 75% to 58%), a change of
about 50 modules. Of course, the effects of this decision on the other
calculations regarding System E were evaluated. In no case was a
significant change to any of the calculated system parameters discovered.

This researcher has discussed the observed distribution of "number
of instructions per module" with several individuals who are familiar
with avionics software systems. As a result of these discussions, it
seems reasonable to assume that the distributions observed are typical of
this type of system. If so, and if the experience of this researcher may
be considered to be typical, future researchers may expect to devote
approximately two man-weeks to the measurement of a single avionics
software system containing approximately 300 individual modules.
Approximately one half of this time will be spent examining individual modules
and physically counting instructions. The extension of this technique to
larger software systems would involve proportionally more work.

To reduce the effort that future researchers would have to devote to
statistical data collection of this type, the researcher investigated
the possibility that the need to count individual machine instructions
could be avoided. The most obvious way to do this would be to calculate
the "mix" of application types for a given system using the number of
modules of each type rather than the number of instructions in these
modules. The data from both Systems A and E were used to perform these
calculations. The results, shown in Table V, indicate that the "mix"
calculated using the two methods is quite different. Thus, a simple
elimination of the requirement to count instructions did not seem to be
warranted.
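
The comparison can be sketched numerically for System A. The sketch below assumes that IDENS is the mix-weighted average of the application-type weighting factors; that assumption is not stated explicitly in this chapter, but it reproduces the IDENS values given in Table V to within rounding. The dictionary and function names are illustrative only.

```python
# Weighting factors by PRICE S application type, as listed in Table V.
WEIGHTS = {
    "system": 10.95, "interactive": 10.95, "real_time": 8.46,
    "on_line": 6.17, "data": 4.10, "string": 2.31, "math": 0.87,
}

# System A "mix" from Table V: fraction of modules sampled of each type.
MODULE_MIX_A = {
    "system": 0.146, "interactive": 0.421, "real_time": 0.151,
    "on_line": 0.140, "data": 0.027, "string": 0.114, "math": 0.000,
}

# System A "mix" from Table V: fraction of instructions sampled of each type.
INSTRUCTION_MIX_A = {
    "system": 0.162, "interactive": 0.321, "real_time": 0.130,
    "on_line": 0.153, "data": 0.051, "string": 0.182, "math": 0.000,
}


def weighted_density(mix, weights=WEIGHTS):
    """Return the mix-weighted average of the weighting factors (assumed IDENS)."""
    return sum(weights[app_type] * fraction for app_type, fraction in mix.items())


idens_by_modules = weighted_density(MODULE_MIX_A)
idens_by_instructions = weighted_density(INSTRUCTION_MIX_A)
print(round(idens_by_modules, 2), round(idens_by_instructions, 2))
```

Under this assumption the two methods differ by roughly three quarters of a point of IDENS for System A, which is consistent with the conclusion that module counts cannot simply replace instruction counts.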

One further effort was made to eliminate the requirement for future
researchers to count individual instructions. It was reasoned that if
modules of each "application type" could be shown to contain the same
mean number of instructions, independent of the software system involved,
then a simple adjustment to the weights for each application type could
be made. It was thus hypothesized that: the mean number of instructions
per module for each application is equal in Systems A and E. This
hypothesis was tested using the data on the "interactive" modules from
the two systems, and a two-tailed t-test for the equality of means. The
results of this test, shown in Table VI, indicate that the hypothesis


TABLE V

Comparison of Alternative Methods of Computing "Mix"

SYSTEM A

Application   Weighting   Fraction of       Fraction of
Type          Factor      Modules Sampled   Instructions Sampled

System        10.95       .146              .162
Interactive   10.95       .421              .321
Real Time      8.46       .151              .130
On Line        6.17       .140              .153
Data           4.10       .027              .051
String         2.31       .114              .182
Math            .87       .000              .000

IDENS CALCULATED USING    Module Data       Instruction Data
                          8.71              7.95

SYSTEM E

Application   Weighting   Fraction of       Fraction of
Type          Factor      Modules Sampled   Instructions Sampled

System        10.95       .135              .112
Interactive   10.95       .435              .265
Real Time      8.46       .276              .456
On Line        6.17       .059              .034
Data           4.10       .000              .000
String         2.31       .006              .011
Math            .87       .088              .123

IDENS CALCULATED USING    Module Data       Instruction Data
                          9.06              8.36

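TABLE VI

Comparison of "Interactive" Modules From Two Systems

DATA

Explanation                                System A      System E

Number of modules in data sample           (illegible)   (illegible)
Mean number of instructions per module     (illegible)   (illegible)
Sample variance (s^2)                      (illegible)   (illegible)

Hypothesis:  $\mu_A = \mu_E$

Alternate:   $\mu_A \neq \mu_E$

Equation:    $z = \dfrac{\bar{x}_A - \bar{x}_E}{\left[(s_A^2/n_A) + (s_E^2/n_E)\right]^{1/2}}$    (Freund, 1971:319)

Therefore:   z = 1.51

             $\alpha$ > .131 (two-tailed test)

Conclusions: The hypothesis cannot be formally rejected. The
             alternate hypothesis is more likely to be true than
             the null hypothesis.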


should probably be rejected, although the confidence level is only 0.87.
Thus, this researcher was not able to identify a method of obtaining
statistical data equivalent to that collected in this study without
counting instructions.
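
The arithmetic behind the reported significance level can be checked with a short sketch. It assumes the large-sample z statistic for the difference of two means and the standard normal distribution; the function names are illustrative, and because the sample means, variances, and sizes for the two systems are not reproduced here, only the reported value z = 1.51 is used.

```python
import math


def two_sample_z(mean_a, var_a, n_a, mean_e, var_e, n_e):
    """Large-sample z statistic for the difference between two sample means."""
    return (mean_a - mean_e) / math.sqrt(var_a / n_a + var_e / n_e)


def two_tailed_p(z):
    """Two-tailed tail probability of a standard normal deviate z."""
    phi = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))  # normal CDF at |z|
    return 2.0 * (1.0 - phi)


p_value = two_tailed_p(1.51)   # z as reported in Table VI
confidence = 1.0 - p_value
print(round(p_value, 3), round(confidence, 2))
```

The tail probability obtained this way agrees with the significance level of .131 in Table VI and with the confidence level of 0.87 quoted in the text.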


PRICE S Results


The results of the PRICE S runs fall logically into two groups. First,
the initial runs, and the adjustments necessary to establish a set of
baseline PRICE S parameters, will be discussed for each of the two systems.
Finally, the results of the sensitivity analysis will be addressed.


System A Adjustments. The initial PRICE S run for System A yielded
a value of SDCPLX that was less than two. This is indicative of a system
which is extremely easy for the shop doing the work. This was not in
consonance with the information gathered in the initial interview.
Estimating the total cost for the system using the more "reasonable" value
of 3.5 for SDCPLX yielded a total cost that significantly exceeded the
measured cost of the software. To resolve this discrepancy, the cost of
the systems engineering function was deleted from the PRICE S estimate.
(The program provides a simple means of accomplishing this.) The decision
to make this adjustment was based on the fact that systems engineering
costs were recorded in a separate, project-wide account and were
not included in the measured software development cost. Once this
adjustment had been made, the value of SDCPLX which the model produced
was raised to a value of 3.3. The researcher interprets this value as
indicating a competent, but not exceptional, work group working on a
project which is well-defined and similar to previous work. The baseline
PRICE S parameters for System A are listed in Table VII.

System E Adjustments. The baseline parameters for System E were
established with much less trouble than was required for System A. This
does not indicate, however, that these parameters are more credible
than those of System A. To the contrary, the ease with which the baseline
was established reflects the researcher's inability to establish
firm "acceptability limits" for the values of three key parameters:
SDCPLX, ENCPLX, and UTIL.


TABLE VII

Parameter                  System A    System E

Descriptive Parameters

TNINST                     19,000      16,000
IDENS                      7.965       8.365
SDCPLX                     3.3         3.3
FUNCT                      307         297
FCONST                     4.44        2.848
LEVEL                      2.6         4.0

"Mix" Parameters

MATH                       .00         .12
STRING                     .18         .01
SYSTEM                     .16         .11
DATA                       .05         .00
ON-LINE                    .16         .03
REAL-TIME                  .13         .46
INTERACTIVE                .32         .27

Other Parameters

ENCPLX                     2.26        1.9*
PLTFM                      1.8         1.8
UTIL                       0.6         0.8*

*These parameters were chosen by the researcher to reflect subjective
considerations as well as objective measurements.


As stated above, the initial interview revealed that the development
team was considered to be exceptionally well qualified for their task.
In addition, there was considerable pressure on the team to shorten the
software development schedule. Finally, hardware capacity limitations
were perceived to be a problem by the individuals familiar with the
project. No objective measure was made of the actual fraction of the
system capacity utilized. Since only two PRICE S model outputs (total
system cost and schedule duration) were considered in this study,
uncertainty in three input parameters meant that the model solution was
underdetermined. Thus, any number of "acceptable" combinations of
values of SDCPLX, ENCPLX, and UTIL could be used to generate the desired
values for the cost and schedule outputs. Consequently, the actual baseline
values chosen (Table VII) must be considered to be only typical of an
infinite set of equally acceptable values.

Sensitivity Analysis. The purpose of the sensitivity analysis was
twofold. First, it was desired to determine the reaction of the PRICE S
model to small changes in the various input parameters in the neighborhood
of the system baseline parameters. Second, it was desired to see
if there was any significant interaction between the "soft" input
parameters in their effect on the model cost and schedule outputs. The
data necessary to achieve both of these purposes was generated from the
20 "sensitivity analysis" PRICE S runs described in Chapter III.

Since the PRICE S model is based on parametric equations, it was
expected that the surfaces described by the sensitivity analysis runs
would appear to be smooth and continuous. Such was indeed the case.
Figures 6 through 17 summarize the results of the runs which allowed the
parameters TNINST, IDENS and UTIL to vary independently. From these


[Figure 6. TNINST vs. COST - System A]
[Figure 7. TNINST vs. COST - System E]
[Figure 8. Total Instructions vs. Schedule - System A]
[Figure 9. Total Instructions vs. Schedule - System E]
[Figure 10. Development Cost vs. Instruction Density - System A]
[Figure 11. Development Cost vs. Instruction Density - System E]
[Figure 12. Schedule Duration vs. Instruction Density - System A]
[Figure 13. Schedule Duration vs. Instruction Density - System E]
[Figure 14. Development Cost vs. Capacity Utilization - System A]
[Figure 15. Development Cost vs. Capacity Utilization - System E]


figures, it is apparent that the only relationship that is significantly
nonlinear is the effect of UTIL on software development cost. This is
especially true of System E, where the baseline value of UTIL was 0.8,
a rather high value. A severe cost penalty is associated with the
increased effort necessary to "force" the system to fit into the limited
capacity available. The exponential shape of this cost curve is not
unreasonable considering the increasing restrictions the development
team must overcome as the capacity limitations become more severe.

It is interesting to note that the PRICE S runs showed no effect
at all of UTIL on schedule duration. Intuitively, this does not seem to
be correct. The problems associated with capacity limitations are not
generally solved by hiring more people, or buying equipment. They are
usually solved through a process of continuing refinement that simply
takes time. The failure of PRICE S to reflect this may indicate an
oversight or an error in the particular version of the PRICE S computer
program used.

The effects of variations of the "soft" PRICE S parameters on cost
and schedule were examined using multivariate linear regression and
three-way ANOVA techniques. Table VIII summarizes both the regressions
and the ANOVA results. It is obvious from these two tables that, in
the neighborhood of the baseline parameters, the effects of the "soft"
parameters are also quite linear. In addition, no evidence of significant
interactions between the input variables was found. Thus, at least
within the neighborhood of a particular set of PRICE S parameters, these
parameters may be treated as an orthogonal set of independent parameters.
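
To show how such a linear model would be used near the baseline, the sketch below evaluates the System A regression with the coefficients read from Table VIII (a0 through a3 for cost, b0 through b3 for schedule). The function names are illustrative, the units of the outputs are not restated here, and the model should be treated as valid only for small changes around the baseline values.

```python
# System A regression coefficients as read from Table VIII.
COST_COEFFS = (6.0, 1001.0, 1251.0, 1801.0)      # a0, a1, a2, a3
SCHEDULE_COEFFS = (0.01, 1.98, 19.13, 0.0)       # b0, b1, b2, b3


def linear_response(coeffs, d_sdcplx, d_encplx, d_pltfm):
    """Evaluate c0 + c1*dSDCPLX + c2*dENCPLX + c3*dPLTFM."""
    c0, c1, c2, c3 = coeffs
    return c0 + c1 * d_sdcplx + c2 * d_encplx + c3 * d_pltfm


def delta_cost(d_sdcplx, d_encplx, d_pltfm):
    """Predicted change in cost for small changes in the 'soft' parameters."""
    return linear_response(COST_COEFFS, d_sdcplx, d_encplx, d_pltfm)


def delta_schedule(d_sdcplx, d_encplx, d_pltfm):
    """Predicted change in schedule for small changes in the 'soft' parameters."""
    return linear_response(SCHEDULE_COEFFS, d_sdcplx, d_encplx, d_pltfm)


# Raising SDCPLX by 0.1 above its baseline, other parameters unchanged.
print(delta_cost(0.1, 0.0, 0.0), delta_schedule(0.1, 0.0, 0.0))
```

Because there are no interaction terms, the effect of changing several parameters at once is simply the sum of the individual effects, which is what treating them as an orthogonal set implies.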

The results of the sensitivity analysis seem quite reasonable, with
one qualification. The lack of any dependency of schedule duration on


TABLE VIII

Sensitivity Analysis Results

Regression Models:  ΔCOST  = a0 + a1 ΔSDCPLX + a2 ΔENCPLX + a3 ΔPLTFM
                    ΔSCHED = b0 + b1 ΔSDCPLX + b2 ΔENCPLX + b3 ΔPLTFM

          SYSTEM A                     SYSTEM E

  i       ai        bi         i       ai        bi

  0       6         .01        0       6         .26
  1       1001      1.98       1       951       -.95
  2       1251      19.13      2       1301      20.30
  3       1801      0.0        3       1706      0.0

ANALYSIS OF VARIANCE (ANOVA)

COST                                          SYSTEM A      SYSTEM E

Total Sum of Squares (SS)                     34,181,443    31,691,305
Less: SS due to grand mean                    33,714,366    31,248,465
Less: SS due to ΔSDCPLX                       (illegible)   (illegible)
Less: SS due to ΔENCPLX                       125,250       135...
Less: SS due to ΔPLTFM                        259,560       2...
Interactions, errors, and all other effects   2,067         2,087

SCHEDULE                                      SYSTEM A      SYSTEM E

Total Sum of Squares (SS)                     11,586.14     7,269.54
Less: SS due to grand mean                    11,556.56     7,236...
Less: SS due to ΔSDCPLX                       (illegible)   (illegible)
Less: SS due to ΔENCPLX                       29.26         (illegible)
Less: SS due to ΔPLTFM                        (illegible)   (illegible)
Interactions, errors, and all other effects   (illegible)   (illegible)


PLTFM lacks intuitive appeal. Increases in specification requirements
imply more detailed test plans and procedures, more extensive "debugging"
efforts, and more testing. It would be expected that all of
these factors would tend to lengthen the development schedule.




V.  SUMMARY  AND  CONCLUSIONS 


The stated purpose of this research was to investigate ways of
gathering and using descriptive data for the purpose of making preliminary
software cost estimates. Such data is necessary, because the
analysts who make preliminary cost estimates for software at ASD rely
heavily on historical analogy. Considering the state-of-the-art of
software cost estimation, it appears that this will continue to be the
case for the foreseeable future.

The purpose of the research was achieved through the accomplishment
of two specific objectives. These were to develop a methodology
for gathering descriptive data on software systems, and to use historical
ASD software acquisition data to calibrate the PRICE S model. These
objectives were related in that the data necessary to achieve the
second objective was collected using the methodology developed in
achieving the first. This allowed the researcher to evaluate the
effectiveness of the methodology in terms of the usefulness of the data
collected.

Conclusions 

The specific conclusions which may be drawn from this research may
be grouped under two classifications. First, conclusions relating to
the data collection methodology and its potential usefulness are
presented. Finally, the conclusions relating to the PRICE S model are
stated.


Conclusions Relating to the Methodology. It was found that the
methodology described above could be used to objectively measure some of
the distinctive characteristics of software systems.

It was found that the data collected using the methodology could
be used for cost estimating purposes.

It was found that both the statistical and subjective information
collected provided meaningful insights into specific software systems.
These insights may be used to draw analogies between systems for cost
estimating purposes.

It was found that the sequential sampling technique used to gather
statistical data about software systems did substantially lower the
amount of work required to calculate the desired descriptive parameters.

No way could be found to eliminate the need to count individual
machine instructions within the modules sampled.

It was found that an attempt should have been made to determine
objective values for "fraction of total system capacity utilized."
This data might have been collected from existing correspondence or
formal reports. In some cases, researchers may attempt to estimate this
parameter independently by analyzing system "load maps" or test reports.
Future researchers should modify their data collection efforts to include
the additional effort.


In general, it was found that while the data collection methodology
worked, it was cumbersome. Two man-weeks or more are required to gather
data on a single avionics software system of about 300 modules using
the methodology described above. While this may be feasible for systems
in this size range, the researcher cannot recommend that this methodology
be applied to large systems containing thousands of modules.

Conclusions Relating to PRICE S. The following conclusions may be
stated with respect to the RCA PRICE S software cost estimation model:

It was found that historical ASD software cost data is not incompatible
with the PRICE S model.

It was found that most of the PRICE S input parameters are amenable
to objective measurement. The exceptions to this finding are "system
design complexity" (SDCPLX) and "engineering complexity" (ENCPLX), which
must be related to objective reality through empirical studies
(calibration).

Finally, it was found that, in practice, the PRICE S model is quite
flexible. This is both a strength and a weakness. The model seems to
have the flexibility to be useful in a wide range of situations. This
same flexibility, however, makes it doubtful that the model should be
used independently of other cost estimating techniques. The assumptions
reflected in the input parameters must be continually questioned and
compared to objective reality if the estimates derived from the model
are to be relied upon.




APPENDIX  A 


A Glossary of PRICE S Terminology

ENCPLX  (Engineering complexity) Relates difficulty of task to time
        schedule for completion.

FCONST  Empirical factor describing the extent of echeloning
        (skewness) of the hierarchical, functional structure.

FUNCT   Total number of functional modules.

IDENS   (Instruction density) Describes the mixture of instruction
        types. The weighting factor used to classify application
        types.

INSPF   Average number of machine-level instructions per functional
        module.

LEVEL   Mean hierarchical level. Average tree structure level at
        which functional modules are defined.

MIX     (Application mix) Describes software profile in terms of
        total code in each of seven application types: mathematical
        applications, string manipulation, system operation, data
        storage and retrieval, on-line communications, real-time
        command and control, and interactive.

SDCPLX  (System design complexity) Relates scope of work to the
        engineering group doing the work.

TNINST  Total number of machine-level instructions.

UTIL    Fraction of total system capacity utilized.

(RCA PRICE SYSTEMS, Inc., 1977)



APPENDIX  B 


The  Initial  Interview  Format 




INITIAL INTERVIEW WORKSHEET (Page 2 of 4)

System Identifier: ________  Date: ________

QUALITATIVE QUESTIONS

Is this a stand-alone system, or part of a distributed processing
software system? ________


If part of a distributed system, does it perform a single (or
homogeneous group of) function(s)? ________


Does this system share the host computer with another system?


Describe the pertinent characteristics of the host computer.

Core Size: ________  Word Size: ________  KOPS: ________

Other (SPECIFY): ________


Was the development schedule either unusually long or short?


Was the group which developed the software system exceptionally
well qualified or inexperienced with this type of software system?




INITIAL INTERVIEW WORKSHEET (Page 3 of 4)

System Identifier: ________  Date: ________

QUALITATIVE QUESTIONS (CONT)

Did the developer use any of the new software development
methodologies? If so, which ones? Did he have previous experience
with them? ________


Were the system specification requirements either unusually
stringent or lax? If so, in what way? ________


Did the development either approach or extend the state-of-the-art
in any way? If so, how?


Was a significant portion of the design completed before work
officially began on the contract? If so, describe the situation.


Percent of processor capacity utilized: Memory ________
Processing Time ________  I/O Channels ________
Others above 50% (Specify): ________

Total number of devices, by type, that interact with/are controlled
by the software system (Identify unique devices). ________


What COST/SCHEDULE/MANPOWER LOADING information is available?




APPENDIX  C 


Derivation of the Sequential Sampling Equation


Objective


Determine the sample size n required to assure that the bounds
of a 95 percent confidence interval for mean module weight $\bar{w}$ lie
within ten percent of $\bar{w}$.

Mathematically, this may be stated:

$$P\left(\left|\bar{w} - \mu_w\right| \le 0.1\,\bar{w}\right) = 0.95$$

Where:  $\bar{w}$ = The sample mean of the random variable "module
        weight."

        $\mu_w$ = The true mean module weight of the population.
        P = The probability function.


Given: (1) A finite population, the software system, containing
N elements (modules), each of which is equally likely to be chosen in
the sampling process.

(2) Each module has a unique value of a random variable
called weight, w, associated with it. Weight is calculated by multiplying
the number of instructions in the module by a weighting factor
which is determined by the application type of the module.


The sample size n is large enough for the Central Limit Theorem
to be invoked. This assumption is discussed below, following the
derivation itself.


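Derivation

Assume that a sample of n modules is drawn, without replacement,
from a software system containing a total of N modules. Let:

$w_i$ = Weight of the i-th module
$n$ = Sample size
$\bar{w} = \frac{1}{n}\sum_{i=1}^{n} w_i = \frac{Y_n}{n}$ = The sample mean of weight    (a)
$\mu_{Y_n} = n\,\mu_w$ = The population mean of $Y_n$    (b)
$\mu_w$ = The population mean of the $w_i$
$\sigma_{Y_n}^2 = n^2\,\sigma_{\bar{w}}^2$ = The variance of $Y_n$    (c)

Then, by the Central Limit Theorem, the distribution of the random
variable

$$Z = \frac{Y_n - \mu_{Y_n}}{\sigma_{Y_n}} \qquad (d)$$

approaches the standard normal distribution as n approaches infinity
(Conover, 1971:53-54).

Substituting equations (a), (b), and (c) into equation (d) yields

$$Z = \frac{\bar{w} - \mu_w}{\sigma_{\bar{w}}} \qquad (e)$$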


Now, according to Freund, if a random sample is drawn, without
replacement, from a finite population, then the variance of the sample
mean is given by:

$$\sigma_{\bar{w}}^2 = \frac{\sigma_w^2}{n}\left[\frac{N-n}{N-1}\right] \qquad \text{(Freund, 1971:204)} \qquad (f)$$

If n is sufficiently large for the Central Limit Theorem to be
invoked, then the upper bound of a 95 percent confidence interval is
defined by the condition:

$$z_u = 1.96 \qquad (g)$$

Where: $z_u$ = The upper bound of the confidence interval.

Since the normal distribution is symmetric about the mean, it is
sufficient to examine the upper limit only. Analogous results follow for
the lower bound.

Finally, the requirement that the bounds of the confidence interval
lie within ten percent of the sample mean implies that at the upper
boundary, the condition:

$$\bar{w} - \mu_w = 0.1\,\bar{w} \qquad (h)$$

must hold. Again, because of symmetry, it is not necessary to consider
the lower bound separately.

Applying the boundary conditions of equations (g) and (h) to
equation (e), and then substituting equation (f) into equation (e)
yields:

$$1.96 = \frac{0.1\,\bar{w}}{\dfrac{\sigma_w}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}}} \qquad (i)$$

This equation defines the conditions necessary for the upper bound of a
95 percent confidence interval to lie precisely $(0.1\,\bar{w})$ from $\bar{w}$. By
squaring both sides of equation (i), solving for n, and replacing $\sigma_w$
with the unbiased estimator $s_w$, it follows that:

$$n = \frac{N\,s_w^2}{(N-1)\left(0.1\,\bar{w}/1.96\right)^2 + s_w^2} \qquad (j)$$

This is the equation to be used in determining the sample size
required for the statistical data collection described in Chapter III
above.
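
As a numerical check, the sample size condition can be evaluated for System A using the Table IV values. The sketch below solves the boundary condition of equation (i) for n, with the sample standard deviation in place of the population value; the function and argument names are illustrative only.

```python
import math


def required_sample_size(total_modules, mean_weight, std_weight,
                         tolerance=0.10, z=1.96):
    """Solve 1.96 = 0.1*w / [(s/sqrt(n)) * sqrt((N-n)/(N-1))] for n.

    total_modules -- N, the number of modules in the system
    mean_weight   -- sample mean of module weight
    std_weight    -- sample standard deviation of module weight
    Returns the (unrounded) number of modules required in the sample.
    """
    half_width = tolerance * mean_weight / z      # allowed standard error
    variance = std_weight ** 2
    return (total_modules * variance
            / ((total_modules - 1) * half_width ** 2 + variance))


# System A values from Table IV: N = 307, mean weight 491.4, std. dev. 648.3.
n_a = required_sample_size(307, 491.4, 648.3)
print(math.ceil(n_a), round(n_a / 307, 3))
```

Rounded up, the result agrees with the 211 modules and the fraction of 0.686 listed for System A in Table IV.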

Use of the Central Limit Theorem

Conover states that the Central Limit Theorem may be invoked when
drawing without replacement from a finite population provided that the
population size N is at least twice the sample size n (Conover, 1971:
54). There is no discussion of whether the theorem may be applied for
larger sample sizes than (N/2). Since both systems studied required
sample sizes that exceeded this restriction (based on equation (j)), the
applicability of the Central Limit Theorem must be addressed.

There is a very definite relationship between the sum of n random
variables  drawn  without  replacement  from  a finite  population,  and  the 
sum  of  the  (N-n)  variables  which  were  not  drawn.  (Failure  to  be 
selected  may  be  considered  a form  of  selection.) 



In fact, it may be shown that:

$$\mu_{r,n} = (-1)^{r}\,\mu_{r,(N-n)}$$

Where: $\mu_{r,n}$ = the rth moment about the mean of the sum of the
n variables drawn, and

$\mu_{r,(N-n)}$ = the rth moment about the mean of the sum of the (N-n)
variables not drawn.

From the above, it may be inferred by symmetry that the distribution
of the sum of a sample of size (N-n) approximates the normal
distribution precisely as well as that of a sample of size n. Thus, it
would seem that the operative constraint on the invocation of the
Central Limit Theorem is that both (N-n) and n must be "large." This
condition is less restrictive than the "50 percent rule" cited above.
For both of the systems sampled, both (N-n) and n exceeded 100;
therefore, the Central Limit Theorem was properly invoked.
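
The moment relationship used in this argument can be checked by brute force on a small population. The sketch below is an illustration only: the population values and function names are hypothetical. It enumerates every sample of size n drawn without replacement and compares the central moments of the sum drawn with those of the sum not drawn. Because the two sums always add to the fixed population total, their deviations from their respective means are equal and opposite, which is what produces the factor (-1)^r.

```python
from fractions import Fraction
from itertools import combinations


def central_moment(values, r):
    """r-th moment about the mean of a list of equally likely values."""
    mean = Fraction(sum(values), len(values))
    return sum((v - mean) ** r for v in values) / len(values)


def sum_moments(population, n, r):
    """Enumerate every sample of size n drawn without replacement and
    return the r-th central moments of (sum drawn, sum not drawn)."""
    total = sum(population)
    drawn = [sum(c) for c in combinations(population, n)]
    not_drawn = [total - d for d in drawn]
    return central_moment(drawn, r), central_moment(not_drawn, r)


population = [1, 2, 4, 8, 16, 3, 5]  # hypothetical finite population, N = 7
for r in range(1, 6):
    mu_n, mu_rest = sum_moments(population, 2, r)
    # Even moments agree; odd moments differ only in sign.
    assert mu_n == (-1) ** r * mu_rest
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the equality holds without floating-point tolerance.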


SYSTEM NAME: SYSTEM A                        DATE: 11 SEPT 1977
PRIMARY FUNCTION:
MAJOR SUB-FUNCTIONS:

PLATFORM
  GROUND (FIXED OR MOBILE)
  AVIONICS (CIVIL OR MILITARY)
  SPACE (MANNED OR UNMANNED)

NUMBER  MODEL    WORD  MEMORY  PER CENT
                 SIZE  SIZE    CAPACITY
                               UTILIZED
1       4-PI     32    16K     50-60
1       LC-4516  16    8K      50-60

LANGUAGE  NUMBER OF MODULES
ASSEMBLY  307

TOTAL COST: 1,054,000
ACCOUNTING ASSUMPTIONS/PROBLEMS: Single software development account,
plus small IV&V contract. Systems engineering costs not included.
Burden in, G&A out.

START DATE: JAN 75        END DATE:
ESTIMATED NORMAL TIME FOR DEVELOPMENT: 38 Months

QUALIFICATIONS OF WORK GROUP: HIGH 7 6 5 4 3 2 1 LOW
PAY SCALE OF WORK GROUP: HIGH 7 6 5 4 3 2 1 LOW

DEVELOPMENT NOT COMPLETED WHEN


SYSTEM NAME: SYSTEM E
PRIMARY FUNCTION: NAVIGATE AND CONTROL A GUIDED MUNITION
MAJOR SUB-FUNCTIONS: DETERMINE WEAPON LOCATION, CALC GUIDANCE ERROR,
CALC LAUNCH WINDOWS, INITIATE WEAPON RELEASE

PLATFORM
  GROUND (FIXED OR MOBILE)
  AVIONICS (CIVIL OR MILITARY)
  SPACE (MANNED OR UNMANNED)

NUMBER  MODEL  WORD  MEMORY  PER CENT
               SIZE  SIZE    CAPACITY
                             UTILIZED
1              32    16K     HIGH

SOURCE
ASSEMBLY  297  4.0

INSTRUCTION MIX
DATA  ON-LINE  REAL-TIME  INTERACTIVE
VARIES
TOTAL

TOTAL COST: 1,800,000
ACCOUNTING ASSUMPTIONS/PROBLEMS: PRIME TASK ON SINGLE CONTRACT. COST OF
OTHER TASKS BACKED OUT. BURDEN INCLUDED, G&A EXCLUDED.

START DATE: APRIL 1974        END DATE:
ESTIMATED NORMAL TIME FOR DEVELOPMENT: 30 MONTHS

QUALIFICATIONS OF WORK GROUP: HIGH 7 6 5 4 3 2 1 LOW
PAY SCALE OF WORK GROUP: HIGH 7 6 5 4 3 2 1 LOW

ASSUMPTIONS/LIMITATIONS/COMMENTS: UTILIZATION OF CAPACITY NOT
MEASURED. THERE WAS PRESSURE TO SHORTEN DEVELOPMENT SCHEDULE.


Captain John Schneider IV was born on October 31, 1945 in Greensboro,
North Carolina, but was raised in Cambridge, Maryland. He attended the
University of Maryland and received a Bachelor of Science degree in
Physics from that institution in 1968. Having participated in the
Reserve Officer Training Corps program as an undergraduate, he entered the
Air Force as a second lieutenant upon graduation.

From 1968 through 1971, Captain Schneider was assigned to the
1st Aerospace Control Squadron (ADC) in Colorado Springs, Colorado, where
he was responsible for developing and implementing techniques for
tracking and predicting the position of highly eccentric and
geostationary earth satellites. After a short tour in Thailand, Captain
Schneider was reassigned to the Colorado Springs area in late 1972.

For the next few years, he worked as a computer programmer on several
astrodynamic computer programs for the NORAD Space Defense Center (SDC)
and acted as a technical consultant on the software system for the Space
Computational Center (the follow-on facility for the SDC). During the
year ending in June 1976, Captain Schneider organized and managed a
group of military and civilian computer programmers who were responsible
for developing a sub-system of astrodynamic command and control
programs for the Space Computational Center.

Captain Schneider is currently assigned to the School of Engineering,
Air Force Institute of Technology, where he is pursuing a Master of
Science degree in Systems Management.

Captain Schneider is married to the former Miss Susan Lake of East
New Market, Maryland. They have two children, Laura and John.

Permanent  address:  107  Mill  Street 

Cambridge,  Maryland  21613 


UNCLASSIFIED

REPORT DOCUMENTATION PAGE

REPORT NUMBER: AFIT/GSM/SM/77S-15

TITLE: A PRELIMINARY CALIBRATION OF THE RCA PRICE S SOFTWARE COST
ESTIMATION MODEL

TYPE OF REPORT & PERIOD COVERED: M.S. Thesis

AUTHOR: John Schneider, IV

PERFORMING ORGANIZATION NAME AND ADDRESS:
Air Force Institute of Technology (AFIT/EN)
Wright-Patterson AFB, Ohio 45433

CONTROLLING OFFICE NAME AND ADDRESS:
Air Force Institute of Technology (AFIT/EN)
Wright-Patterson AFB, Ohio 45433

SECURITY CLASS. (of this report): UNCLASSIFIED

DISTRIBUTION STATEMENT (of this Report):
Approved for public release; distribution unlimited

SUPPLEMENTARY NOTES:
Approved for public release; IAW AFR 190-17
Capt, USAF
Director of Information

KEY WORDS:
Cost Estimates
Data Acquisition
Computer Programming

Each year, the Department of Defense spends more than three billion dollars on
computer software, yet software managers are notoriously unable to predict the
cost of software development projects. This is especially true of preliminary
cost estimates made during the formative stages of a project. Even when
parametric relationships are used, such estimates depend heavily on analogy with
previously developed systems. The purpose of this research is to investigate
ways of gathering and using descriptive data for the purpose of making
preliminary software cost estimates. A methodology for the collection of.




